Abstract

We prove the following theorem about relative entropy of quantum states. 

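Substate theorem: Let $\rho$ and $\sigma$ be quantum states in the same Hilbert space with relative
entropy $S(\rho\|\sigma) := \mathrm{Tr}\,\rho(\log\rho - \log\sigma) = c$. Then for all $\epsilon > 0$, there is a state $\rho'$ such that
the trace distance $\|\rho' - \rho\|_{\mathrm{tr}} := \mathrm{Tr}\sqrt{(\rho' - \rho)^2} \le \epsilon$, and $\rho'/2^{O(c/\epsilon^2)} \le \sigma$.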

It states that if the relative entropy of $\rho$ and $\sigma$ is small, then there is a state $\rho'$ close to $\rho$, i.e. with small
trace distance $\|\rho' - \rho\|_{\mathrm{tr}}$, that when scaled down by a factor $2^{O(c)}$ 'sits inside', or becomes a 'substate'
of, $\sigma$. This result has several applications in quantum communication complexity and cryptography.
Using the substate theorem, we derive a privacy trade-off for the set membership problem in the two-party
quantum communication model. Here Alice is given a subset $A \subseteq [n]$, Bob an input $i \in [n]$, and
they need to determine if $i \in A$.

Privacy trade-off for set membership: In any two-party quantum communication protocol
for the set membership problem, if Bob reveals only $k$ bits of information about his input,
then Alice must reveal at least $n/2^{O(k)}$ bits of information about her input.

We also discuss relationships between various information theoretic quantities that arise naturally in the 
context of the substate theorem. 

1 Introduction 

The main contribution of this paper is a theorem, called the substate theorem; it states, roughly, that if the
relative entropy, $S(\rho\|\sigma) := \mathrm{Tr}\,\rho(\log\rho - \log\sigma)$, of two quantum states $\rho$ and $\sigma$ is at most $c$, then there is a
state $\rho'$ close to $\rho$ such that $\rho'/2^{O(c)}$ sits inside $\sigma$. This implies that, as we will formalise later, state
$\sigma$ can 'masquerade' as state $\rho$ with probability $2^{-O(c)}$ in many situations. Before we discuss the substate
theorem, let us first see a setting in which it is applied in order to get some motivation. This application
concerns the trade-off in privacy in two-party quantum communication protocols for the set membership
problem [MNSW98]. After that, we discuss the substate theorem proper, followed by a brief description of
several subsequent applications of the theorem.

1.1 The set membership problem 

Definition 1 In the set membership problem $\mathrm{SetMemb}_n$, Alice is given a subset $A \subseteq [n]$ and Bob an element
$i \in [n]$. The two parties are required to exchange messages according to a fixed protocol in order for the
last recipient of a message to determine if $i \in A$. We often think of Alice's input as a string $x \in \{0,1\}^n$
which we view as the characteristic vector of the set $A$; the protocol requires that in the end the last recipient
output $x_i$. In this viewpoint, Bob's input $i$ is called an index and the set membership problem is called the
index function problem.

The set membership problem is a fundamental problem in communication complexity. In the classical
setting, it was studied by Miltersen, Nisan, Safra and Wigderson [MNSW98], who showed that if Bob sends
a total of at most $b$ bits, then Alice must send $n/2^{O(b)}$ bits. Note that this is optimal up to constants, as
there is a trivial protocol where Bob sends the first $b$ bits of his index to Alice, and Alice replies by sending
the corresponding part of her bit string. The proof of Miltersen et al. relied on the richness technique they
developed to analyse such protocols. However, here is a simple round-elimination argument that gives this
lower bound, and as we will see below, this argument generalises to the quantum setting. Fix a protocol
where Bob sends a total of at most $b$ bits, perhaps spread over several rounds. We can assume without loss
of generality that Bob is the last recipient of a message; otherwise, we can augment the protocol by making
Alice send the answer to Bob at the end, which increases Alice's communication cost by one bit. Modify this
protocol as follows. In the new protocol, Alice and Bob use shared randomness to guess all the messages
of Bob. Alice sends her responses based on this guess. After this, if Bob finds that the guessed messages
are exactly what he wanted to send anyway, he accepts the answer given by the original protocol; otherwise,
he aborts the protocol. Thus, if the original protocol was correct with probability $p$, the new one-round
protocol, when it does not abort, which happens with probability at least $2^{-b}$, is correct with probability at
least $p$. A standard information theoretic argument of Gavinsky, Kempe, Regev and de Wolf [GKRdW06]
now shows that in any such protocol, Alice must send $2^{-b} \cdot n(1 - H(p))$ bits.
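
The trivial upper-bound protocol mentioned above can be sketched in a few lines. This is only an illustration under our own assumptions: $n$ is taken to be a power of two, indices are 0-based, and the function name and return convention are hypothetical, not part of the paper.

```python
def trivial_set_membership(x, i, b):
    """Bob sends the first b bits of his index i; Alice replies with the
    matching block of her string x.

    Assumptions (ours): len(x) = n is a power of two, 0 <= i < n, 0 <= b <= log2(n).
    Returns (answer, bits sent by Bob, bits sent by Alice).
    """
    n = len(x)
    k = n.bit_length() - 1                 # log2(n)
    assert n == 1 << k and 0 <= b <= k and 0 <= i < n
    prefix = i >> (k - b)                  # Bob's message: top b bits of i
    block_len = n >> b                     # n / 2^b positions share this prefix
    block = x[prefix * block_len:(prefix + 1) * block_len]  # Alice's reply
    low = i & (block_len - 1)              # Bob's remaining k - b index bits
    return block[low], b, len(block)
```

For example, with $n = 8$ and $b = 1$ Alice sends 4 bits, so her cost is exactly $n/2^b$, matching the lower bound up to the constant in the exponent.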

In the quantum setting, a special case of the set membership problem was studied by Ambainis, Nayak,
Ta-Shma and Vazirani [ANTV02], where Bob is not allowed to send any message and there is no prior
entanglement between Alice and Bob. They referred to this as quantum random access codes, because in
this setting the problem can be thought of as Alice encoding $n$ classical bits $x$ using qubits in such a way
that Bob is able to determine any one $x_i$ with probability at least $p > 1/2$. Note that in the quantum setting,
unlike in its classical counterpart, it is conceivable that the measurement needed to determine $x_i$ makes the
state unsuitable for determining any of the other bits $x_j$. In fact, Ambainis et al. exhibit a quantum random
access code encoding two classical bits $(x_1, x_2)$ into one qubit such that any single bit $x_i$ can be recovered
with probability strictly greater than 1/2, which is impossible classically. Their main result, however, was
that any such quantum code must have $n(1 - H(p))$ qubits. They also gave a classical code with encoding
length $n(1 - H(p)) + O(\log n)$, thus showing that quantum random access codes provide no substantial
improvement over classical random access codes.

In this paper, we study the general set membership problem, where Alice and Bob are allowed to exchange
quantum messages over several rounds as well as share prior entanglement. Ashwin Nayak (private
communication) observed that the classical round elimination argument described above is applicable in the
quantum setting: if Alice and Bob share prior entanglement in the form of EPR pairs, then using quantum
teleportation [BBC+93], Bob's messages can be assumed to be classical. Now, Alice can guess Bob's
messages, and we can combine the classical round elimination argument above with the results on random
access codes to show that Alice must send at least $2^{-(2b+1)} \cdot n(1 - H(p))$ qubits to Bob.

We strengthen these results and show that this trade-off between the communication required of Alice
and Bob is in fact a trade-off in their privacy: if a protocol has the property that Bob 'leaks' only a small
number of bits of information about his input, then in that protocol Alice must leak a large amount of
information about her input; in particular, she must send a large number of qubits. Before we present our
result, let us explain what we mean when we say that Bob leaks only a small number of bits of information
about his input. Fix a protocol for set membership. Assume that Bob's input $J$ is a random element of $[n]$.
Suppose Bob operates faithfully according to the protocol, but Alice deviates from it and manages to get her
registers, say $A$, entangled with $J$: we say that Bob leaks only $b$ bits of information about his input if the
mutual information between $J$ and $A$, $I(J : A)$, is at most $b$. This must hold for all strategies adopted by
Alice. Note that we do not assume that Bob's messages contain only $b$ qubits; they can be arbitrarily long. In
the quantum setting, Alice has a big bag of tricks she can use in order to extract information from Bob. See
Section 3 for an example of a cheating strategy for Alice that exploits Alice's ability to perform quantum
operations. We show the following result.

Result 1 (informal statement) If there is a quantum protocol for the set membership problem where Bob
leaks only $b$ bits of information about his input $J$, then Alice must leak $\Omega(n/2^{O(b)})$ bits of information about
her input $x$. In particular, this implies that Alice must send $n/2^{O(b)}$ qubits.

Related work: One can compare this with work on private information retrieval [CKGS98]. There, one
requires that the party holding the database $x$ know nothing about the index $i$. Nayak [Nay99] sketched an
argument showing that in both classical and quantum settings, the party holding the database has to send
$\Omega(n)$ bits/qubits to the party holding the index. Result 1 generalises Nayak's argument and shows a trade-off
between the loss in privacy for the database user Bob, and the loss in privacy for the database server Alice.

Recently, Klauck [Kla02] studied privacy in quantum protocols. In Klauck's setting, two players collaborate
to compute a function, but at any point, one of the players might decide to terminate the protocol
and try to infer something about the input of the other player using the bits in his possession. The players
are honest but curious: in a sense, they don't deviate from the protocol in any way other than, perhaps, by
stopping early. In this model, Klauck shows that there is a protocol for the set disjointness function where
neither player reveals more than $O((\log n)^2)$ bits of information about his input, whereas in every classical
protocol, at least one of the players leaks $\Omega(\sqrt{n}/\log n)$ bits of information about his input. Our model
of privacy is more stringent. We allow malicious players who can deviate arbitrarily from the protocol.
An immediate corollary of our result is that for the set membership problem, one of the players must leak
$\Omega(\log n)$ bits of information. This implies a similar loss in privacy for several other problems, including the
set disjointness problem.



Privacy trade-off and the substate theorem: We now briefly motivate the need for the substate theorem
in showing the privacy trade-off in Result 1 above. We know from the communication trade-off argument
for set membership presented above that in any protocol for the problem, if Bob sends only $b$ qubits, then
Alice must send $n/2^{O(b)}$ qubits. Unfortunately, this argument is not applicable when the protocol does not
promise that Bob sends only $b$ qubits, but only ensures that the number of bits of information Bob leaks is
at most $b$. So, the assumption is weaker. On the other hand, the conclusion now is stronger, for it asserts
that Alice must leak $n/2^{O(b)}$ bits of information, which implies that she must send at least these many
qubits. The above argument relied on the fact that Alice could generate a distribution on messages, so that
every potential message of Bob is well-represented in this distribution: if Bob's messages are classical and
$b$ bits long, the uniform distribution is such a distribution, since each $b$-bit message appears in it with probability
$2^{-b}$. Note that we are not assuming that messages of Bob have at most $b$ qubits, so Alice cannot guess
these messages in this manner. Nevertheless, using only the assumption that Bob leaks at most $b$ bits of
information about his input, the substate theorem provides us an alternative for the uniform distribution. It
allows us to prove the existence of a single quantum state that Alice and Bob can generate without access to
Bob's input, after which if Bob is provided the input $i$, he can obtain the correct final state with probability
at least $2^{-O(b)}$ or abort if he cannot. After this, a quantum information theoretic argument of Gavinsky,
Kempe, Regev and de Wolf [GKRdW06] implies that Alice must leak at least $n/2^{O(b)}$ bits of information
about her input. The proof is discussed in detail in Section 3.



1.2 The substate theorem 

It will be helpful to first consider the classical analogue of the substate theorem. Let $P$ and $Q$ be probability
distributions on the set $[n]$ such that their relative entropy is bounded by $c$, that is

$$S(P\|Q) := \sum_{i \in [n]} P(i) \log \frac{P(i)}{Q(i)} \le c. \qquad (1)$$

When $c$ is small, this implies that $P$ and $Q$ are close to each other in total variation distance; indeed, one
can show that (see e.g. [CT91, Lemma 12.6.1])

$$\|P - Q\|_1 := \sum_{i \in [n]} |P(i) - Q(i)| \le \sqrt{(2 \ln 2)\, c}. \qquad (2)$$

That is, the probability of an event $\mathcal{E} \subseteq [n]$ in $P$ is close to its probability in $Q$: $|P(\mathcal{E}) - Q(\mathcal{E})| \le
\sqrt{(c \ln 2)/2}$. Now consider the situation when $c \gg 1$. In that case, expression (2) becomes weak, and it is
not hard to construct examples where $\|P - Q\|_1$ is very close to 2. Thus by bounding $\|P - Q\|_1$ alone, we
cannot infer that an event $\mathcal{E}$ with probability 3/4 in $P$ has any non-zero probability in $Q$. But is it true that
when $S(P\|Q) < +\infty$ and $P(\mathcal{E}) > 0$, then $Q(\mathcal{E}) > 0$? Yes! To see this, let us reinterpret the expression
in (1) as the expectation of $\log P(i)/Q(i)$ as $i$ is chosen according to $P$. Thus, one is led to believe that if
$S(P\|Q) \le c < +\infty$, then $\log P(i)/Q(i)$ is typically bounded by $c$, that is, $P(i)/Q(i)$ is typically bounded
by $2^c$. One can formalise this intuition and show, for all $r > 1$,

$$\Pr_{i \in P}\left[\frac{P(i)}{Q(i)} > 2^{r(c+1)}\right] < \frac{1}{r}. \qquad (3)$$
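
We now briefly sketch a proof of the above inequality. Let $\mathrm{Good} := \{i : P(i)/2^{r(c+1)} \le Q(i)\}$, $\mathrm{Bad} :=
[n] \setminus \mathrm{Good}$. By concavity of the logarithm function, we get

$$c \ge S(P\|Q) = \sum_{i \in \mathrm{Good}} P(i) \log \frac{P(i)}{Q(i)} + \sum_{i \in \mathrm{Bad}} P(i) \log \frac{P(i)}{Q(i)} \ge P(\mathrm{Good}) \log \frac{P(\mathrm{Good})}{Q(\mathrm{Good})} + P(\mathrm{Bad}) \cdot r(c+1).$$

By elementary calculus, $P(\mathrm{Good}) \log \frac{P(\mathrm{Good})}{Q(\mathrm{Good})} > -1$. Thus we get $P(\mathrm{Bad}) \cdot r(c + 1) < c + 1$, proving the
above inequality.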
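
As a sanity check, inequalities (2) and (3) can be tested numerically on small random distributions. The sketch below is ours, not the paper's: the helper names are hypothetical, the threshold $2^{r(c+1)}$ is the one used to define the set Good in the proof sketch, and the tolerance constants only absorb floating-point error.

```python
import math
import random

def rel_entropy(P, Q):
    """S(P||Q) in bits; assumes Q[i] > 0 wherever P[i] > 0."""
    return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

def l1_distance(P, Q):
    """Total variation distance in the paper's normalisation (no factor 1/2)."""
    return sum(abs(p - q) for p, q in zip(P, Q))

def bad_mass(P, Q, c, r):
    """P-probability of the indices i with P(i)/Q(i) > 2**(r*(c+1))."""
    threshold = 2.0 ** (r * (c + 1))
    return sum(p for p, q in zip(P, Q) if p > threshold * q)

def random_distribution(n, rng):
    """Hypothetical helper: a strictly positive random distribution on n points."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(1000):
        P, Q = random_distribution(6, rng), random_distribution(6, rng)
        c = max(rel_entropy(P, Q), 0.0)
        # Inequality (2): ||P - Q||_1 <= sqrt((2 ln 2) c)
        assert l1_distance(P, Q) <= math.sqrt(2 * math.log(2) * c) + 1e-12
        # Inequality (3): the 'bad' indices carry P-mass below 1/r
        for r in (1.5, 2, 4):
            assert bad_mass(P, Q, c, r) < 1 / r
```

A random search like this cannot prove the inequalities; it only guards against misreading the constants.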

We now define a new probability distribution $P'$ as follows:

$$P'(i) := \begin{cases} \dfrac{P(i)}{P(\mathrm{Good})} & i \in \mathrm{Good} \\ 0 & i \in \mathrm{Bad}, \end{cases}$$

that is, in $P'$ we just discard the bad values of $i$ and renormalise. Now, $\frac{r-1}{r 2^{r(c+1)}} P'$ is dominated by $Q$
everywhere, since $P(\mathrm{Good}) > (r-1)/r$. We have thus shown the classical analogue of the desired substate theorem.

Result 2' (Classical substate theorem) Let $P, Q$ be probability distributions on the same sample space
with $S(P\|Q) \le c$. Then for all $r > 1$, there exist distributions $P', P''$ such that $\|P - P'\|_1 \le \frac{2}{r}$ and
$Q = \alpha P' + (1 - \alpha) P''$, where $\alpha := \frac{r-1}{r 2^{r(c+1)}}$.
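
The construction behind the classical substate theorem is explicit enough to run. The following sketch is our own illustration, with hypothetical function names. It takes $c$ to be exactly $S(P\|Q)$, uses $\alpha = (r-1)/(r\,2^{r(c+1)})$ as derived from the Good/Bad argument, and obtains $P''$ by solving $Q = \alpha P' + (1-\alpha)P''$.

```python
import math

def rel_entropy(P, Q):
    """S(P||Q) in bits; assumes Q[i] > 0 wherever P[i] > 0."""
    return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

def classical_substate(P, Q, r):
    """Return (P1, P2, alpha) with Q = alpha*P1 + (1 - alpha)*P2.

    P1 is P restricted to the 'good' indices and renormalised; r > 1.
    """
    c = rel_entropy(P, Q)
    threshold = 2.0 ** (r * (c + 1))
    good = [p <= threshold * q for p, q in zip(P, Q)]
    p_good = sum(p for p, g in zip(P, good) if g)      # exceeds 1 - 1/r by (3)
    P1 = [p / p_good if g else 0.0 for p, g in zip(P, good)]
    alpha = (r - 1) / (r * threshold)
    P2 = [(q - alpha * p1) / (1 - alpha) for q, p1 in zip(Q, P1)]
    return P1, P2, alpha
```

Calling `classical_substate([0.99, 0.01], [0.9999, 0.0001], 2)` discards the second outcome, whose likelihood ratio of 100 exceeds the threshold, and returns a valid decomposition of `Q`.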

Let us return to our event $\mathcal{E}$ that occurred with some small probability $p$ in $P$. Now, if we take $r$ to
be $2/p$, then $\mathcal{E}$ occurs with probability at least $p/2$ in $P'$, and hence appears with probability $p/2^{O(c/p)}$ in
$Q$. Thus, we have shown that even though $P$ and $Q$ are far apart as distributions, events that have positive
probability, no matter how small, in $P$, continue to have positive probability in $Q$.

The main contribution of this paper is a quantum analogue of Result 2'. To state it, we recall that the
relative entropy of two quantum states $\rho, \sigma$ in the same Hilbert space is defined as $S(\rho\|\sigma) := \mathrm{Tr}\,\rho(\log\rho -
\log\sigma)$, and the trace distance between them is defined as $\|\rho - \sigma\|_{\mathrm{tr}} := \mathrm{Tr}\sqrt{(\rho - \sigma)^2}$.

Result 2 (Quantum substate theorem) Suppose $\rho$ and $\sigma$ are quantum states in the same Hilbert space with
$S(\rho\|\sigma) \le c$. Then for all $r > 1$, there exist states $\rho', \rho''$ such that $\|\rho - \rho'\|_{\mathrm{tr}} \le \frac{2}{\sqrt{r}}$ and $\sigma = \alpha\rho' + (1 - \alpha)\rho''$,
where $\alpha := \frac{r-1}{r 2^{rc'}}$ and $c' := c + 4\sqrt{c + 2} + 2\log(c + 2) + 5$.

The quantum substate theorem has been stated above in a form that brings out the analogy with the classical
statement in Result 2'. In Section 4, we have a more nuanced statement which is often better suited for
applications.



Remark: Using the quantum substate theorem and arguing as above, one can conclude that if an event $\mathcal{E}$
has probability $p$ in $\rho$, then its probability $q$ in $\sigma$ is bounded below by a positive quantity depending only on
$p$ and $c = S(\rho\|\sigma)$. Actually, one can show a stronger lower bound on $q$ directly, as follows. Using the fact
that relative entropy cannot increase after doing a measurement, we get

$$p \log \frac{p}{q} + (1 - p) \log \frac{1 - p}{1 - q} \le S(\rho\|\sigma) \le c.$$

We now argue as in the proof of Result 2' to show the stronger lower bound on $q$.
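
One way to carry out this last step is sketched below. The explicit constant is our own bookkeeping and is not taken from the text; the convention $0 \log 0 = 0$ is assumed.

```latex
\begin{align*}
(1-p)\log\frac{1-p}{1-q} &\ge (1-p)\log(1-p) > -1
  && \text{since } 1-q \le 1 \text{ and } x\log x \ge -\tfrac{\log e}{e} \text{ on } [0,1],\\
p\log\frac{p}{q} &< c + 1
  && \text{from the two-outcome relative entropy bound},\\
q &> p \cdot 2^{-(c+1)/p}.
\end{align*}
```

This mirrors the step $P(\mathrm{Good}) \log \frac{P(\mathrm{Good})}{Q(\mathrm{Good})} > -1$ in the classical proof sketch.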

In view of this, one may wonder if there is any motivation at all in proving a quantum substate theorem.
Recall, however, that the quantum substate theorem gives a structural relationship between $\rho$ and $\sigma$ which
is useful in many applications, e.g. the privacy trade-off for set membership discussed earlier. It does not
seem possible in these applications to replace this structural relationship by considerations about the relative
probabilities of an event $\mathcal{E}$ in $\rho$ and $\sigma$. In our privacy trade-off application, $\sigma$ plays the role of the state
that Alice and Bob can generate without access to Bob's input, and $\rho$ plays the role of the correct final
state of Bob in the protocol. To prove the trade-off, $\sigma$ should be able to 'masquerade' as $\rho$ with probability
$2^{-O(b)}$, $b$ being the amount of information Bob leaks about his input. Also, Bob should know whether the
'masquerade' succeeded or not so that he can abort if it fails, and it is this requirement that needs the substate
property.

The ideas used to arrive at Result 2' do not immediately generalise to prove Result 2 because $\rho$ and
$\sigma$ need not be simultaneously diagonalisable. As it turns out, our proof of the quantum substate theorem
takes an indirect route. First, by exploiting the Fuchs and Caves [FC95] characterisation of fidelity and a
minimax theorem of game theory, we obtain a 'lifting' theorem about an 'observational' version of relative
entropy; this statement is interesting on its own. Using this 'lifting' theorem, and a connection between the
'observational' version of relative entropy and actual relative entropy, we argue that it is enough to verify
the original statement when $\rho$ and $\sigma$ reside in a two-dimensional space and $\rho$ is a pure state. The
two-dimensional case is then established by a direct computation.

1.3 Other applications of the substate theorem 

The conference version of this paper [JRS02], in which the substate theorem was first announced, described
two applications of the theorem. The first application provided tight privacy trade-offs for the set membership
problem, which we have discussed above. This application is a good illustration of the use of
the substate theorem, for several applications have the same structure. The second application showed tight
lower bounds for the pointer chasing problem [NW93, KNTZ01], thereby establishing that the lower bounds
shown by Ponzio, Radhakrishnan and Venkatesh [PRV01] in the classical setting are valid also for quantum
protocols without prior entanglement.

Subsequent to [JRS02], several applications of the classical and quantum substate theorems have been
discovered. We briefly describe these results now. Earlier, in related but independent work, Chakrabarti,
Shi, Wirth and Yao [CSWY01] discovered their very influential information cost approach for obtaining
direct sum results in communication complexity. Jain, Radhakrishnan and Sen [JRS03] observed that the
arguments used by Chakrabarti et al. could be derived more systematically using the classical substate
theorem; this approach allowed them to extend Chakrabarti et al.'s direct sum results, which applied only to
one-round and simultaneous message protocols under product distributions on inputs, to two-party multiple
round protocols under product distributions on inputs. Ideas from [JRS03] were then applied by Chakrabarti
and Regev [CR04] to obtain their tight lower bound on data structures for the approximate nearest neighbour
problem on the Hamming cube.

The quantum substate theorem, the main result of this paper, has also found several other applications.
Jain, Radhakrishnan and Sen [JRS05] used it to show how any two-party multiple round quantum protocol
where Alice leaks only $a$ bits of information about her input and Bob leaks only $b$ bits of information about
his, can be transformed to a one-round quantum protocol with prior entanglement where Alice transmits
just $a 2^{O(b)}$ bits to Bob. Note that plain Schumacher compression [Sch95] cannot be used to prove such
a result, for three reasons: we require a 'one-shot' as opposed to an asymptotic result, there can be interaction
in a general communication protocol, and the reduced state of any single party can be mixed.
Jain et al.'s compression result gives an alternative proof of Result 1 because the work of Ambainis et
al. [ANTV02] implies that in any such protocol for set membership Alice must send $\Omega(n)$ bits to Bob.
Jain et al. also used the classical and quantum substate theorems to prove worst case direct sum results
for simultaneous message and one round classical and quantum protocols, improving on [JRS03]. More
recently, using the quantum substate theorem, Jain [Jai06] obtained a nearly tight characterisation of the
communication complexity of remote state preparation, an area that has received considerable attention
lately. The substate theorem has also found application in the study of quantum cryptographic protocols:
using it, Jain [Jai05] showed nearly tight bounds on the binding-concealing trade-offs for quantum string
commitment schemes.

1.4 Organisation of the rest of the paper

In the next section, we recall some basic facts from classical and quantum information theory that will
be used in the rest of the paper. In Section 3, we formally define our model of privacy loss in quantum
communication protocols and prove our privacy trade-off result for set membership assuming the substate
theorem. In Section 4, we give the actual statement of the substate theorem that is used in our privacy
trade-offs, and a complete proof for it. Sections 3 and 4 may be read independently of each other. In Section 5,
we mention some open problems, and finally in the appendix we discuss relationships between various
information theoretic quantities that arise naturally in the context of the substate theorem. The appendix
may be read independently of Section 3.

2 Information theory background 

We now recall some basic definitions and facts from classical and quantum information theory, which will
be useful later. For excellent introductions to classical and quantum information theory, see the books by
Cover and Thomas [CT91] and Nielsen and Chuang [NC00] respectively.

In this paper, all functions will have finite domains and ranges, all sample spaces will be finite, all 
random variables will have finite range and all Hilbert spaces finite dimensional. All logarithms are taken to 
base two. We start off by recalling the definition of a quantum state. 

Definition 2 (Quantum state) A quantum state or a density matrix in a Hilbert space $\mathcal{H}$ is a Hermitian,
positive semidefinite operator on $\mathcal{H}$ with unit trace.

Note that a classical probability distribution can be thought of as a special case of a quantum state with
diagonal density matrix. An important class of quantum states are what are known as pure states, which are
states of the form $|\psi\rangle\langle\psi|$, where $|\psi\rangle$ is a unit vector in $\mathcal{H}$. Often, we abuse notation and refer to $|\psi\rangle$ itself as
the pure quantum state; note that this notation is ambiguous up to a multiplicative unit complex number.

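Let $\mathcal{H}, \mathcal{K}$ be two Hilbert spaces and $\omega$ a quantum state in the bipartite system $\mathcal{H} \otimes \mathcal{K}$. The reduced quantum
state of $\mathcal{H}$ is given by tracing out $\mathcal{K}$, also known as the partial trace $\mathrm{Tr}_{\mathcal{K}}\,\omega := \sum_k (1_{\mathcal{H}} \otimes \langle k|)\, \omega\, (1_{\mathcal{H}} \otimes |k\rangle)$,
where $1_{\mathcal{H}}$ is the identity operator on $\mathcal{H}$ and the summation is over an orthonormal basis for $\mathcal{K}$. It is easy to
see that the partial trace is independent of the choice of the orthonormal basis for $\mathcal{K}$. For a quantum state $\rho$
in $\mathcal{H}$, any quantum state $\omega$ in $\mathcal{H} \otimes \mathcal{K}$ such that $\mathrm{Tr}_{\mathcal{K}}\,\omega = \rho$ is said to be an extension of $\rho$ in $\mathcal{H} \otimes \mathcal{K}$; if $\omega$ is
pure, it is said, more specifically, to be a purification.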

We next define a POVM element, which formalises the notion of a single outcome of a general measurement
on a quantum state.

Definition 3 (POVM element) A POVM (positive operator valued measure) element $F$ on Hilbert space
$\mathcal{H}$ is a Hermitian positive semidefinite operator on $\mathcal{H}$ such that $F \le 1$, where $1$ is the identity operator on
$\mathcal{H}$.

If $\rho$ is a quantum state in $\mathcal{H}$, the success probability of $\rho$ under POVM element $F$ is given by $\mathrm{Tr}(F\rho)$.

We now define a POVM, which represents the most general form of a measurement allowed by quantum
mechanics.

Definition 4 (POVM) A POVM $\mathcal{F}$ on Hilbert space $\mathcal{H}$ is a finite set of POVM elements $\{F_1, \ldots, F_k\}$ on $\mathcal{H}$
such that $\sum_{i=1}^{k} F_i = 1$, where $1$ is the identity operator on $\mathcal{H}$.

If $\rho$ is a quantum state in $\mathcal{H}$, let $\mathcal{F}\rho$ denote the probability distribution $\{p_1, \ldots, p_k\}$ on $[k]$, where $p_i :=
\mathrm{Tr}(F_i \rho)$.

Typically, the distance between two probability distributions $P, Q$ on the same sample space $\Omega$ is measured
in terms of the total variation distance defined as $\|P - Q\|_1 := \sum_{i \in \Omega} |P(i) - Q(i)|$. The quantum
analogue of the total variation distance is known as the trace distance.

Definition 5 (Trace distance) Let $\rho, \sigma$ be quantum states in the same Hilbert space. Their trace distance is
defined as $\|\rho - \sigma\|_{\mathrm{tr}} := \mathrm{Tr}\sqrt{(\rho - \sigma)^2}$.

If we think of probability distributions as diagonal density matrices, then the trace distance between them
is nothing but their total variation distance. For pure states $|\psi\rangle, |\phi\rangle$ it is easy to see that their trace distance
is given by $\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \|_{\mathrm{tr}} = 2\sqrt{1 - |\langle\psi|\phi\rangle|^2}$. The following fundamental fact shows that the trace
distance between two density matrices bounds how well one can distinguish between them by a POVM. A
proof can be found in [AKN98].

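Fact 1 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Let $\mathcal{F}$ be a POVM on $\mathcal{H}$. Then,
$\|\mathcal{F}\rho - \mathcal{F}\sigma\|_1 \le \|\rho - \sigma\|_{\mathrm{tr}}$. Also, there is a two-outcome orthogonal measurement that achieves equality
above.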
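
The trace distance and Fact 1 can be checked by hand on a qubit. The sketch below is only an illustration under our own simplifying assumptions: the states are real, so density matrices are real symmetric $2 \times 2$ matrices whose eigenvalues have a closed form, the measurement is fixed to the computational basis, and all function names are hypothetical.

```python
import math

def pure_state(theta):
    """Density matrix of the real qubit state cos(theta)|0> + sin(theta)|1>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def trace_distance(rho, sigma):
    """Tr sqrt((rho - sigma)^2) for real symmetric 2x2 matrices.

    This is the sum of the absolute eigenvalues of rho - sigma
    (the paper's normalisation, without a factor 1/2).
    """
    a = rho[0][0] - sigma[0][0]
    b = rho[0][1] - sigma[0][1]
    d = rho[1][1] - sigma[1][1]
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
    return abs(mean + radius) + abs(mean - radius)

def measured_l1(rho, sigma):
    """l1 distance between outcome distributions of a computational-basis measurement."""
    return sum(abs(rho[i][i] - sigma[i][i]) for i in range(2))
```

For two such pure states the overlap is $\cos(\theta_1 - \theta_2)$, so `trace_distance` should agree with $2\sqrt{1 - \cos^2(\theta_1 - \theta_2)}$, and `measured_l1` should never exceed it.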

Another measure of distinguishability between two probability distributions $P, Q$ on the same sample
space $\Omega$ is the Bhattacharyya distinguishability coefficient defined as $B(P, Q) := \sum_{i \in \Omega} \sqrt{P(i)Q(i)}$. Its
quantum analogue is known as fidelity. We will need several facts about fidelity in order to prove the
quantum substate theorem.

Definition 6 (Fidelity) Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Their fidelity is defined
as $B(\rho, \sigma) := \mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}$.

The fidelity, or sometimes its square, is also referred to as the "transition probability" of Uhlmann. For
probability distributions, the fidelity turns out to be the same as their Bhattacharyya distinguishability
coefficient. Jozsa [Joz94] gave an elementary proof for finite dimensional Hilbert spaces of the following basic
and remarkable property about fidelity.

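Fact 2 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Then, $B(\rho, \sigma) = \sup_{\mathcal{K}, |\psi\rangle, |\phi\rangle} |\langle\psi|\phi\rangle|$,
where $\mathcal{K}$ ranges over all Hilbert spaces and $|\psi\rangle, |\phi\rangle$ range over all purifications of $\rho, \sigma$ respectively in
$\mathcal{H} \otimes \mathcal{K}$. Also, for any Hilbert space $\mathcal{K}$ such that $\dim(\mathcal{K}) \ge \dim(\mathcal{H})$, there exist purifications $|\psi\rangle, |\phi\rangle$ of $\rho, \sigma$
in $\mathcal{H} \otimes \mathcal{K}$, such that $B(\rho, \sigma) = |\langle\psi|\phi\rangle|$.

We will also need the following fact about fidelity, proved by Fuchs and Caves [FC95].

Fact 3 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Then $B(\rho, \sigma) = \inf_{\mathcal{F}} B(\mathcal{F}\rho, \mathcal{F}\sigma)$, where
$\mathcal{F}$ ranges over POVMs on $\mathcal{H}$. In fact, the infimum above can be attained by a complete orthogonal measurement
on $\mathcal{H}$.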

The most general operation on a density matrix allowed by quantum mechanics is what is called a
completely positive trace preserving superoperator, or superoperator for short. Let $\mathcal{H}, \mathcal{K}$ be Hilbert spaces.
A superoperator $T$ from $\mathcal{H}$ to $\mathcal{K}$ maps quantum states $\rho$ in $\mathcal{H}$ to quantum states $T\rho$ in $\mathcal{K}$, and is described
by a finite collection of linear maps $\{A_1, \ldots, A_l\}$ from $\mathcal{H}$ to $\mathcal{K}$ called Kraus operators such that $T\rho =
\sum_{i=1}^{l} A_i \rho A_i^\dagger$. Unitary transformations, taking partial traces and POVMs are special cases of superoperators.

We will use the notation $A \ge B$ for Hermitian operators $A, B$ in the same Hilbert space as a shorthand
for the statement '$A - B$ is positive semidefinite'. Thus, $A \ge 0$ denotes that $A$ is positive semidefinite.

Let $X$ be a classical random variable. Let $P$ denote the probability distribution induced by $X$ on its range
$\Omega$. The Shannon entropy of $X$ is defined as $H(X) := H(P) := -\sum_{i \in \Omega} P(i) \log P(i)$. For any $0 \le p \le 1$,
the binary entropy of $p$ is defined as $H(p) := H((p, 1-p)) = -p \log p - (1-p) \log(1-p)$. If $A$ is a quantum
system with density matrix $\rho$, then its von Neumann entropy is $S(A) := S(\rho) := -\mathrm{Tr}\,\rho \log \rho$. It is obvious that
the von Neumann entropy of a probability distribution equals its Shannon entropy. If $A, B$ are two disjoint
quantum systems, the mutual information of $A$ and $B$ is defined as $I(A : B) := S(A) + S(B) - S(AB)$;
mutual information of two random variables is defined analogously. By a quantum encoding $M$ of a classical
random variable $X$ on $m$ qubits, we mean that there is a bipartite quantum system with joint density matrix
$\sum_x \Pr[X = x] \cdot |x\rangle\langle x| \otimes \rho_x$, where the first system is the random variable, the second system is the quantum
encoding and an $x$ in the range of $X$ is encoded by a quantum state $\rho_x$ on $m$ qubits. The reduced state of the
first system is nothing but the probability distribution $\sum_x \Pr[X = x] \cdot |x\rangle\langle x|$ on the range of $X$. The reduced
state of the second system is the average code word $\rho := \sum_x \Pr[X = x] \cdot \rho_x$. The mutual information of
this encoding is given by

$$I(X : M) = S(X) + S(M) - S(XM) = S(\rho) - \sum_x \Pr[X = x] \cdot S(\rho_x).$$

We now define the relative entropy of a pair of quantum states. 

Definition 7 (Relative entropy) If $\rho, \sigma$ are quantum states in the same Hilbert space, their relative entropy
is defined as $S(\rho\|\sigma) := \mathrm{Tr}(\rho(\log\rho - \log\sigma))$.

For probability distributions $P, Q$ on the same sample space $\Omega$, the above definition reduces to $S(P\|Q) =
\sum_{i \in \Omega} P(i) \log \frac{P(i)}{Q(i)}$. The following fact lists some useful properties of relative entropy. Proofs can be found
in [NC00, Chapter 11]. The monotonicity property below is also called Lindblad-Uhlmann monotonicity.

Fact 4 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Then,

1. $S(\rho\|\sigma) \ge 0$, with equality iff $\rho = \sigma$;

2. $S(\rho\|\sigma) < +\infty$ iff $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, where $\mathrm{supp}(\rho)$ denotes the support of $\rho$, i.e. the span of the
eigenvectors corresponding to non-zero eigenvalues of $\rho$;

3. $S(\cdot\|\cdot)$ is continuous in its two arguments when it is not infinite;

4. (Unitary invariance) If $U$ is a unitary transformation on $\mathcal{H}$, $S(U\rho U^\dagger \| U\sigma U^\dagger) = S(\rho\|\sigma)$;

5. (Monotonicity) Let $\mathcal{K}$ be a Hilbert space and $T$ be a completely positive trace preserving superoperator
from $\mathcal{H}$ to $\mathcal{K}$. Then, $S(T\rho\|T\sigma) \le S(\rho\|\sigma)$.

The following fact relates mutual information to relative entropy, and is easy to prove. 

Fact 5 Let $X$ be a classical random variable and $M$ be a quantum encoding of $X$, i.e. each $x$ in the range
of $X$ is encoded by a quantum state $\rho_x$. Let $\rho := \sum_x \Pr[X = x] \cdot \rho_x$ be the average code word. Then,
$I(X : M) = \sum_x \Pr[X = x] \cdot S(\rho_x\|\rho)$.
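
In the classical special case, where every code word is a probability distribution (a diagonal density matrix), Fact 5 can be checked directly. The sketch below is our own illustration with hypothetical helper names; it computes $I(X : M)$ from the three entropies and leaves the comparison with the average relative entropy to the caller.

```python
import math

def entropy(P):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in P if p > 0)

def rel_entropy(P, Q):
    """S(P||Q) in bits; assumes Q[i] > 0 wherever P[i] > 0."""
    return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

def mutual_information(prior, codes):
    """I(X:M) = H(X) + H(M) - H(XM) for a classical encoding.

    prior[x] = Pr[X = x]; codes[x] is the distribution encoding x.
    Returns (mutual information, average code word).
    """
    joint = [px * m for px, code in zip(prior, codes) for m in code]
    average = [sum(px * code[j] for px, code in zip(prior, codes))
               for j in range(len(codes[0]))]
    return entropy(prior) + entropy(average) - entropy(joint), average
```

With `prior = [0.5, 0.5]` and `codes = [[0.9, 0.1], [0.2, 0.8]]`, the first returned value should match `sum(px * rel_entropy(code, average) ...)` up to rounding.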

The next fact is an extension of the random access code arguments of [ANTV02], and was proved by
Gavinsky, Kempe, Regev and de Wolf [GKRdW06, Lemma 1].

Fact 6 Let $X = X_1 \cdots X_n$ be a classical random variable of $n$ uniformly distributed bits. Let $M$ be a
quantum encoding of $X$ on $m$ qubits. For each $i \in [n]$, suppose there is a POVM $\mathcal{F}_i$ on $M$ with three
outcomes 0, 1, ?. Let $Y_i$ denote the random variable obtained by applying $\mathcal{F}_i$ to $M$. Suppose there are real
numbers $0 \le \lambda_i, \epsilon_i \le 1$ such that $\Pr[Y_i \ne\ ?] \ge \lambda_i$ and $\Pr[Y_i = X_i \mid Y_i \ne\ ?] \ge 1/2 + \epsilon_i$, where the
probability arises from the randomness in $X$ as well as the randomness of the outcome of $\mathcal{F}_i$. Then,

$$\sum_{i=1}^{n} \lambda_i \epsilon_i^2 \le \sum_{i=1}^{n} \lambda_i (1 - H(1/2 + \epsilon_i)) \le I(X : M) \le m.$$
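
The first inequality in Fact 6 is purely numerical and easy to sanity-check. The sketch below assumes the leftmost sum is $\sum_i \lambda_i \epsilon_i^2$, i.e. with squared biases, and restricts to $\epsilon_i \le 1/2$ so that $1/2 + \epsilon_i$ is a probability. Both are our reading of the statement, and the function names are hypothetical.

```python
import math

def binary_entropy(p):
    """H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def qubit_lower_bounds(lams, epss):
    """Return the two sums that lower-bound I(X:M), and hence m, in Fact 6.

    lams[i]: probability that the i-th measurement does not answer '?'.
    epss[i]: bias of the answer given that it is not '?', 0 <= eps <= 1/2.
    """
    quadratic = sum(l * e * e for l, e in zip(lams, epss))
    entropic = sum(l * (1 - binary_entropy(0.5 + e)) for l, e in zip(lams, epss))
    return quadratic, entropic
```

For instance, if every $\lambda_i = 2^{-b}$ and every $\epsilon_i = \epsilon$, the quadratic sum is $n \epsilon^2 / 2^b$, which is the shape of the trade-off discussed in the introduction.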

3 Privacy trade-offs for set membership 

In this section, we prove a trade-off between privacy loss of Alice and privacy loss of Bob for the set
membership problem $\mathrm{SetMemb}_n$ assuming the substate theorem. We then embed the index function into other
functions using the concept of VC-dimension and show privacy trade-offs for some other problems. But
first, we formally define our model of privacy loss in quantum communication protocols.

3.1 Quantum communication protocols 

We consider two-party quantum communication protocols as defined by Yao [Yao93]. Let $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ be sets
and $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ be a function. There are two players Alice and Bob, who hold qubits. Alice gets an
input $x \in \mathcal{X}$ and Bob an input $y \in \mathcal{Y}$. When the communication protocol $\mathcal{P}$ starts, Alice and Bob each
hold some 'work qubits' initialised in the state $|0\rangle$. Alice and Bob may also share an input independent
prior entanglement. Thus, the initial superposition is simply \^) A\'^)\y) b\^) B, where is a pure state 
providing the input independent prior entanglement. Here the subscripts denote the ownership of the qubits by Alice and Bob. Some of the qubits of $|\psi\rangle$ belong to Alice, the rest belong to Bob. The players take turns to communicate to compute $f(x,y)$. Suppose it is Alice's turn. Alice can make an arbitrary unitary transformation on her qubits depending on $x$ only and then send some qubits to Bob. Sending qubits does not change the overall superposition, but rather the ownership of the qubits, allowing Bob to apply his next unitary transformation, which depends on $y$ only, on his original qubits plus the newly received qubits. At the end of the protocol, the last recipient of qubits performs a measurement in the computational basis of some qubits in her possession to output an answer $\mathcal{P}(x,y)$. For each $(x,y) \in \mathcal{X} \times \mathcal{Y}$ the unitary transformations that are applied, as well as the qubits that are to be sent in each round, the number of rounds, the choice of the starting player, and the designation of which qubits are to be treated as 'answer qubits' are specified in advance by the protocol $\mathcal{P}$. We say that $\mathcal{P}$ computes $f$ with $\epsilon$-error in the worst case, if $\max_{x,y} \Pr[\mathcal{P}(x,y) \ne f(x,y)] \le \epsilon$. We say that $\mathcal{P}$ computes $f$ with $\epsilon$-error with respect to a probability distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$, if $\mathrm{E}_\mu[\Pr[\mathcal{P}(x,y) \ne f(x,y)]] \le \epsilon$. The communication complexity of $\mathcal{P}$ is defined to be the total number of qubits exchanged. Note that seemingly more general models of communication protocols can be thought of, where superoperators may be applied by the parties instead of unitary transformations and an arbitrary POVM may be used to output the answer of the protocol instead of measuring in the computational basis, but such models can be converted to the unitary model above without changing the error probabilities, communication complexity, and as we will see later, privacy loss to a cheating party.

Given a probability distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$, we define $|\mu\rangle := \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \sqrt{\mu(x,y)}\, |x\rangle|y\rangle$. Running protocol $\mathcal{P}$ with superposition $|\mu\rangle$ fed to Alice's and Bob's inputs means that we first create the state $\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \sqrt{\mu(x,y)}\, |x\rangle|0\rangle_A|\psi\rangle|0\rangle_B|y\rangle$, then feed the middle three registers to $\mathcal{P}$ and let $\mathcal{P}$ run its course till just before applying the final measurement to determine the answer of the protocol. We define the success probability of $\mathcal{P}$ when $|\mu\rangle$ is fed to Alice's and Bob's inputs to be the probability that measuring the inputs and the answer qubits in the computational basis at the end of $\mathcal{P}$ produces consistent results. Similarly, running protocol $\mathcal{P}$ with mixture $\mu$ fed to Alice's and Bob's inputs is defined in the straightforward fashion. It is easy to see that the success probability of $\mathcal{P}$ on superposition $|\mu\rangle$ is the same as the success probability on mixture $\mu$, that is, the success probability on superposition $|\mu\rangle$ is equal to $\mathrm{E}_\mu[\Pr[\mathcal{P}(x,y) = f(x,y)]]$.

Now let $\mu_X, \mu_Y$ be probability distributions on $\mathcal{X}, \mathcal{Y}$, and let $\mu := \mu_X \times \mu_Y$ denote the product distribution on $\mathcal{X} \times \mathcal{Y}$. Let $\mathcal{P}$ be the prescribed honest protocol for $f$. Now let us suppose that Bob turns 'malicious' and deviates from the prescribed protocol $\mathcal{P}$ in order to learn as much as he can about Alice's input. Note that Alice remains honest in this scenario, i.e. she continues to follow $\mathcal{P}$. Thus, Alice and Bob are now actually running a 'cheating' protocol $\tilde{\mathcal{P}}$. Let registers $A, X, B, Y$ denote Alice's work qubits, Alice's input qubits, Bob's work qubits and Bob's input qubits respectively at the end of $\tilde{\mathcal{P}}$. The privacy leakage from Alice to Bob in $\tilde{\mathcal{P}}$ is captured by the mutual information $I(X : BY)$ between Alice's input register and Bob's qubits in $\tilde{\mathcal{P}}$. We want to study how large $\sup I(X : BY)$ can be for a given function $f$, product distribution $\mu$, and protocol $\mathcal{P}$, where the supremum is taken over all 'cheating' protocols $\tilde{\mathcal{P}}$ wherein Bob can be arbitrarily malicious but Alice continues to follow $\mathcal{P}$ honestly. We shall call this quantity the privacy loss of $\mathcal{P}$ from Alice to Bob. Privacy leakage and privacy loss from Bob to Alice can be defined similarly.

One of the ways that Bob can cheat (even without Alice realising it!) is by running $\mathcal{P}$ with the superposition $|\mu_Y\rangle := \sum_{y \in \mathcal{Y}} \sqrt{\mu_Y(y)}\, |y\rangle$ fed to the register $Y$. This method of cheating gives Bob at least as much information about Alice's input as in the 'honest' run of $\mathcal{P}$ when the mixture $\mu_Y$ is fed to $Y$. Sometimes it can give much more. Consider the set membership problem, where Alice has a bit string $x$ which denotes the characteristic vector of a subset of $[n]$ and Bob has an $i \in [n]$. Consider a clean protocol $\mathcal{P}$ for the index function problem. Recall that a protocol $\mathcal{P}$ is said to be clean if the work qubits of both the players except the answer qubits are in the state $|0\rangle$ at the end of $\mathcal{P}$. We shall show a privacy trade-off result for $\mathcal{P}$ under the uniform distribution on the inputs of the two players. For simplicity, assume that $\mathcal{P}$ is errorless (an error of 1/4 will only change the privacy losses by a multiplicative constant). Alice can cheat by feeding a uniform superposition over bit strings into her input register $X$, and then running $\mathcal{P}$. Bob is honest, and has a random $i \in [n]$. At the end of this 'cheating' run of $\mathcal{P}$, Alice applies a Hadamard transformation on each of the registers $X_j$, $1 \le j \le n$. Suppose she were to measure them now in the computational basis. For all $j \ne i$, she would measure $|0\rangle$ with probability 1. For $j = i$, she would measure $|1\rangle$ with probability 1/2. Thus, Alice has extracted about $(\log n)/2$ bits of information about Bob's index $i$. An 'honest' run of $\mathcal{P}$ would have yielded Alice only 1 bit of information about $i$. Klauck [Kla02], based on Cleve et al. [CvDNT98], has made a similar observation about $\Omega(n)$ privacy loss for clean protocols computing the inner product mod 2 function. The significance of our lower bounds on privacy loss is that they make no assumptions about the protocol $\mathcal{P}$.
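The cheating strategy for Alice described above can be illustrated with a tiny state vector simulation. The sketch below assumes an idealised clean, errorless protocol whose only effect is to write $x_i$ into the answer qubit, so that a uniform superposition over $x$ ends in the state $2^{-n/2}\sum_x |x\rangle|x_i\rangle$. This modelling choice, the zero-based index `i`, and all function names are assumptions made for illustration only.

```python
import itertools
import math

def final_state(n, i):
    """Idealised end state sum_x 2^{-n/2} |x>|x_i>; keys are (x, answer_bit)."""
    amp = 2 ** (-n / 2)
    return {(x, x[i]): amp for x in itertools.product((0, 1), repeat=n)}

def hadamard_on_bit(state, j):
    """Apply a Hadamard gate to the j-th qubit of Alice's input register X."""
    out = {}
    s = 1 / math.sqrt(2)
    for (x, ans), a in state.items():
        for new_bit in (0, 1):
            # H|0> = (|0> + |1>)/sqrt(2), H|1> = (|0> - |1>)/sqrt(2)
            sign = -1 if (x[j] == 1 and new_bit == 1) else 1
            y = x[:j] + (new_bit,) + x[j + 1:]
            out[(y, ans)] = out.get((y, ans), 0.0) + sign * s * a
    return out

def prob_bit_is_one(state, j):
    """Probability that measuring qubit X_j in the computational basis gives 1."""
    return sum(abs(a) ** 2 for (x, _), a in state.items() if x[j] == 1)

n, i = 4, 2  # Bob's (hidden) index is i, counted from zero
state = final_state(n, i)
for j in range(n):
    state = hadamard_on_bit(state, j)
probs = [prob_bit_is_one(state, j) for j in range(n)]
```

In this model, `probs[j]` vanishes (up to rounding) for every `j != i` and equals 1/2 for `j == i`, matching the measurement statistics claimed in the text.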

We now define a superpositional privacy loss inspired by the above example. We consider a 'cheating' run of $\mathcal{P}$ when mixture $\mu_X$ is fed to register $X$ and superposition $|\mu_Y\rangle$ to register $Y$. Let $I'(X : BY)$ denote the mutual information of Alice's input register $X$ with Bob's registers $BY$ at the end of this 'cheating' run of $\mathcal{P}$.

Definition 8 (Superpositional privacy loss) The superpositional privacy loss of $\mathcal{P}$ for function $f$ on the product distribution $\mu$ from Alice to Bob is defined as $L^{\mathcal{P}}(f, \mu, A, B) := I'(X : BY)$. The superpositional privacy loss from Bob to Alice, $L^{\mathcal{P}}(f, \mu, B, A)$, is defined similarly. The superpositional privacy loss of $\mathcal{P}$ for $f$, $L^{\mathcal{P}}(f)$, is the maximum over all product distributions $\mu$ of $\max\{L^{\mathcal{P}}(f, \mu, A, B), L^{\mathcal{P}}(f, \mu, B, A)\}$.

Remarks: 

1. Our notion of superpositional privacy loss can be viewed as a quantum analogue of the "combinatorial-informational" bounded error measure of privacy loss in Bar-Yehuda et al. [BCKO93].

2. In [Kla02], Klauck defines a similar notion of privacy loss. In his definition, a mixture according to distribution $\mu$ (not necessarily a product distribution) is fed to both Alice's and Bob's input registers. He does not consider the case of superpositions being fed to input registers. For product distributions, our notion of privacy is more stringent than Klauck's, and in fact, the $L^{\mathcal{P}}(f, \mu, A, B)$ defined above is an upper bound (to within an additive factor of $\log |\mathcal{Z}|$) on Klauck's privacy loss function.

3. We restrict ourselves to product distributions because we allow Bob to cheat by putting a superposition in his input register $Y$. He should be able to do this without any a priori knowledge of $x$, which implies that the distribution $\mu$ should be a product distribution.

4. The (general) privacy loss defined above is trivially an upper bound on the superpositional privacy loss.



3.2 The privacy trade-off result 

Theorem 1 Consider a quantum protocol $\mathcal{P}$ for $\mathrm{SetMemb}_n$ where Alice is given a subset of $[n]$ and Bob an element of $[n]$. Let $\mu$ denote the uniform probability distribution on Alice's and Bob's inputs. Suppose $\mathcal{P}$ has error at most $1/2 - \epsilon$ with respect to $\mu$. Suppose $L^{\mathcal{P}}(\mathrm{SetMemb}_n, \mu, B, A) \le k$. Then,

$$L^{\mathcal{P}}(\mathrm{SetMemb}_n, \mu, A, B) \ge \frac{n}{2^{O(\epsilon^{-3}(k+1))}} - 2.$$

Proof: Let registers $A, X, B, Y$ denote Alice's work qubits, Alice's input qubits, Bob's work qubits and Bob's input qubits respectively, at the end of protocol $\mathcal{P}$. We can assume without loss of generality that the last round of communication in $\mathcal{P}$ is from Alice to Bob, since otherwise, we can add an extra round of communication at the end wherein Alice sends the answer qubit to Bob. This process increases $L^{\mathcal{P}}(\mathrm{SetMemb}_n, \mu, A, B)$ by at most two and does not increase $L^{\mathcal{P}}(\mathrm{SetMemb}_n, \mu, B, A)$ (see e.g. the information theoretic arguments in [CvDNT98]). Thus at the end of $\mathcal{P}$, Bob measures the answer qubit, which is a qubit in the register $B$, in the computational basis to determine $f(x, y)$. In the proof, subscripts of pure and mixed states will denote the registers which are in those states.

Let $|\psi_i\rangle_{XAYB}$ be the state vector of Alice's and Bob's qubits and $(\rho_i)_{XA}$ the density matrix of Alice's qubits at the end of the protocol $\mathcal{P}$, when Alice is fed a uniform superposition over bit strings in her input register $X$ and Bob is fed $|i\rangle$ in his input register $Y$. Let $1/2 + \epsilon_i$ be the success probability of $\mathcal{P}$ in this case. Without loss of generality, $\epsilon_i \ge 0$. Consider a run, Run 1, of $\mathcal{P}$ when a uniform mixture of indices is fed to register $Y$, and a uniform superposition over bit strings is fed to register $X$. Let $1/2 + \epsilon$ be the success probability of $\mathcal{P}$ for Run 1, which is also the success probability of $\mathcal{P}$ with respect to $\mu$. Then $\epsilon = (1/n) \sum_{i=1}^{n} \epsilon_i$. Let $I_1(Y : AX)$ denote the mutual information of register $Y$ with registers $AX$ at the end of Run 1 of $\mathcal{P}$. We know that $I_1(Y : AX) = L^{\mathcal{P}}(\mathrm{SetMemb}_n, \mu, B, A) \le k$. Let $\rho_{XA} := (1/n) \sum_{i=1}^{n} (\rho_i)_{XA}$ and $k_i := S((\rho_i)_{XA} \| \rho_{XA})$. Note that $0 \le k_i < \infty$ by Fact 4. By Fact 5,

$$k \ge I_1(Y : AX) = \frac{1}{n} \sum_{i=1}^{n} S((\rho_i)_{XA} \| \rho_{XA}) = \frac{1}{n} \sum_{i=1}^{n} k_i.$$

Let $k_i' := k_i + 4\sqrt{k_i + 2} + 2\log(k_i + 2) + 5$ and $r_i := (2/\epsilon_i)^2$.

Let us now consider a run, Run 2, of $\mathcal{P}$ with uniform superpositions fed to registers $X, Y$. Let $|\phi\rangle_{XAYB}$ be the state vector of Alice's and Bob's qubits at the end of Run 2 of $\mathcal{P}$. Then, $\mathrm{Tr}_{YB}\, |\phi\rangle\langle\phi| = \rho_{XA}$, and the success probability of $\mathcal{P}$ for Run 2 is $1/2 + \epsilon$. Let $Q$ be an additional qubit. By the substate theorem (Theorem 2), there exist states $|\psi_i'\rangle_{XAYB}$, $|\theta_i'\rangle_{XAYB}$ such that $\| |\psi_i'\rangle\langle\psi_i'| - |\psi_i\rangle\langle\psi_i| \|_{\mathrm{tr}} \le 2/\sqrt{r_i} = \epsilon_i$ and $\mathrm{Tr}_{YBQ}\, |\phi_i\rangle\langle\phi_i| = \rho_{XA}$, where

$$|\phi_i\rangle_{XAYBQ} := \sqrt{\frac{r_i - 1}{r_i 2^{r_i k_i'}}}\; |\psi_i'\rangle_{XAYB} |1\rangle_Q + \sqrt{1 - \frac{r_i - 1}{r_i 2^{r_i k_i'}}}\; |\theta_i'\rangle_{XAYB} |0\rangle_Q.$$

In fact, there exists a unitary transformation $U_i$ on registers $YBQ$, transforming the state $|\phi\rangle_{XAYB}|0\rangle_Q$ to the state $|\phi_i\rangle_{XAYBQ}$.

For each $i \in [n]$, let $X_i'$ denote the classical random variable got by measuring the $i$th bit of register $X$ in state $|\phi\rangle_{XAYB}$. We now prove the following claim.

Claim 1 For each $i \in [n]$, there is a POVM $\mathcal{M}_i$ with three outcomes 0, 1, ? acting on $YB$ such that if $Z_i'$ is the result of $\mathcal{M}_i$ on $|\phi\rangle_{XAYB}$, then $\Pr[Z_i' \ne\, ?] \ge 2^{-(r_i k_i' + 1)}$ and $\Pr[Z_i' = X_i' \mid Z_i' \ne\, ?] \ge 1/2 + \epsilon_i/2$.

Proof: The POVM $\mathcal{M}_i$ proceeds by first bringing in the ancilla qubit $Q$ initialised to $|0\rangle_Q$, then applying $U_i$ to the registers $YBQ$ and finally measuring $Q$ in the computational basis. If it observes $|1\rangle_Q$, $\mathcal{M}_i$ measures the answer qubit in $B$ in the computational basis and declares the result as $Z_i'$. If it observes $|0\rangle_Q$, $\mathcal{M}_i$ outputs ?.

When applied to $|\phi\rangle_{XAYB}$, $\mathcal{M}_i$ first generates $|\phi_i\rangle_{XAYBQ}$ and then measures $Q$ in the computational basis. In the case when $\mathcal{M}_i$ measures $|1\rangle$ for qubit $Q$, which happens with probability

$$\frac{r_i - 1}{r_i 2^{r_i k_i'}} \ge 2^{-(r_i k_i' + 1)},$$

the state vector of $XAYB$ collapses to $|\psi_i'\rangle$. In this case, by Fact 1,

$$\Pr[Z_i' = X_i' \mid Z_i' \ne\, ?] \ge \frac{1}{2} + \epsilon_i - \frac{1}{2} \left\| |\psi_i\rangle\langle\psi_i| - |\psi_i'\rangle\langle\psi_i'| \right\|_{\mathrm{tr}} \ge \frac{1}{2} + \frac{\epsilon_i}{2}.$$

■

Consider now a run, Run 3, of $\mathcal{P}$ when a uniform mixture over bit strings is fed to register $X$ and a uniform superposition over $[n]$ is fed to register $Y$. Let $\rho_{XAYB}$ denote the density matrix of the registers $XAYB$ at the end of Run 3 of $\mathcal{P}$. In fact, measuring in the computational basis the register $X$ in the state $|\phi\rangle_{XAYB}$ gives us $\rho_{XAYB}$; also, $\mathrm{Tr}_{YB}\, \rho_{XAYB} = \rho_{XA}$. Let $I_3(X : YB)$ denote the mutual information between register $X$ and registers $YB$ in the state $\rho_{XAYB}$. For each $i \in [n]$, let $X_i$ denote the classical random variable corresponding to the $i$th bit of register $X$ in state $\rho_{XAYB}$. Then, $X := X_1 \ldots X_n$ is a uniformly distributed bit string of length $n$. Let $Z_i$ denote the result of POVM $\mathcal{M}_i$ of the above claim applied to $\rho_{XAYB}$. Then since $\mathcal{M}_i$ acts only on the registers $YB$, we get $\Pr[Z_i \ne\, ?] = \Pr[Z_i' \ne\, ?] \ge 2^{-(r_i k_i' + 1)}$, and $\Pr[Z_i = X_i \mid Z_i \ne\, ?] = \Pr[Z_i' = X_i' \mid Z_i' \ne\, ?] \ge 1/2 + \epsilon_i/2$. Define $\mathrm{Good} := \{i \in [n] : k_i \le 2k/\epsilon,\ \epsilon_i \ge \epsilon/2\}$. By Markov's inequality, $|\mathrm{Good}| \ge n\epsilon/2$. By Fact 6,



IiX:YB) > Y.-^ ^ E 

i=l tgGood 

„g3 . 2e"^(2fc+4V2fc+2+21og(2fc+2)+6) ^ 
> > 



> 



32 ~ 2^"^(2*:+4V2FF2+21og(2fc+2)+12) 

n 



2e-^{14A;+24) ' 

By the arguments in the first paragraph of this proof, we have $L^{\mathcal{P}}(\mathrm{SetMemb}_n, \mu, A, B) \ge I_3(X : YB) - 2$. This completes the proof of the theorem. ■

Remark: This theorem is the formal version of Result 1 stated in the introduction.

As we have mentioned earlier, this theorem has been generalised in [JRS05] in a suitable manner to relate the privacy loss for any function to its one-way communication complexity. We do not get into the details of this statement here. Instead, we give a weaker corollary of the present theorem that relates the privacy loss of a function to the Vapnik-Chervonenkis dimension (VC-dimension) of its communication matrix.

Definition 9 (VC-dimension) For a boolean valued function $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$, a set $T \subseteq \mathcal{Y}$ is shattered if for all $S \subseteq T$ there is an $x \in \mathcal{X}$ such that $\forall y \in T : f(x, y) = 1 \Leftrightarrow y \in S$. The VC-dimension of $f$ for $\mathcal{X}$, $\mathrm{VC}_{\mathcal{X}}(f)$, is the largest size of such a shattered set $T \subseteq \mathcal{Y}$. We define $\mathrm{VC}_{\mathcal{Y}}(f)$ analogously.

Informally, $\mathrm{VC}_{\mathcal{X}}(f)$ captures the size of the largest instance of the set membership problem $\mathrm{SetMemb}_n$ that can be 'embedded' into $f$. Using this connection, one can trivially prove a privacy trade-off result for $f$ in terms of $\mathrm{VC}_{\mathcal{X}}(f)$, $\mathrm{VC}_{\mathcal{Y}}(f)$ by invoking Theorem 1. This generalises Klauck's lower bound [Kla00] for the communication complexity of bounded error one-way quantum protocols for $f$ in terms of its VC-dimension.
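To make the definition of a shattered set concrete, the following brute-force sketch computes $\mathrm{VC}_{\mathcal{X}}(f)$ for small finite input sets. It is exponential in $|\mathcal{Y}|$ and meant only as an illustration; the function names and the two example functions are assumptions introduced here.

```python
from itertools import combinations

def is_shattered(f, xs, T):
    """T is shattered if the rows x -> (f(x, y))_{y in T} realise all 2^|T| patterns."""
    patterns = {tuple(f(x, y) for y in T) for x in xs}
    return len(patterns) == 2 ** len(T)

def vc_dimension_x(f, xs, ys):
    """Largest size of a subset T of ys that is shattered by the inputs in xs."""
    best = 0
    for size in range(1, len(ys) + 1):
        if any(is_shattered(f, xs, T) for T in combinations(ys, size)):
            best = size
        else:
            # Subsets of shattered sets are shattered, so no larger set can work.
            break
    return best

# Set membership on [n]: Alice holds a subset S, Bob holds an element i.
n = 3
subsets = [frozenset(s) for r in range(n + 1) for s in combinations(range(n), r)]
elements = list(range(n))
vc_set_memb = vc_dimension_x(lambda S, i: int(i in S), subsets, elements)

# A threshold ("greater than or equal") function for comparison.
vc_threshold = vc_dimension_x(lambda x, y: int(x >= y), list(range(4)), list(range(4)))
```

For set membership every subset of `elements` is realised by some row, so the whole of $[n]$ is shattered, whereas the threshold function cannot realise the pattern (0, 1) on an increasing pair and so shatters only singletons.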

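Corollary 1 Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be a boolean valued function. Let $\mathrm{VC}_{\mathcal{X}}(f) = n$. Then there is a product distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$ such that, if $\mathcal{P}$ is a quantum protocol for $f$ with average error at most $1/2 - \epsilon$ with respect to $\mu$,

$$L^{\mathcal{P}}(f, \mu, B, A) \le k \;\Rightarrow\; L^{\mathcal{P}}(f, \mu, A, B) \ge \frac{n}{2^{O(\epsilon^{-3}(k+1))}} - 2.$$

An analogous statement holds for $\mathrm{VC}_{\mathcal{Y}}(f)$.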

Proof: Since $\mathrm{VC}_{\mathcal{X}}(f) = n$, there is a set $T \subseteq \mathcal{Y}$, $|T| = n$, which is shattered. Without loss of generality, $T = [n]$. For any subset $S \subseteq T$, there is an $x \in \mathcal{X}$ such that $\forall y \in T : f(x, y) = 1 \Leftrightarrow y \in S$. We now give a reduction from $\mathrm{SetMemb}_n$ to $f$ as follows: In $\mathrm{SetMemb}_n$, Alice is given a subset $S \subseteq [n]$ and Bob is given a $y \in [n]$. Alice and Bob run the protocol $\mathcal{P}$ for $f$ on inputs $x$ and $y$ respectively, where $x$ is the element of $\mathcal{X}$ corresponding to $S$, to solve $\mathrm{SetMemb}_n$. The corollary now follows from Theorem 1. ■

The following consequence of Corollary 1 is immediate.

Corollary 2 Quantum protocols for set membership $\mathrm{SetMemb}_n$, set disjointness for subsets of $[n]$ and inner product modulo 2 in $\{0, 1\}^n$ each suffer from $\Omega(\log n)$ privacy loss.

Proof: Follows trivially from Corollary 1 since all the three functions have VC-dimension $n$. ■

4 The substate theorem 

In this section, we prove the quantum substate theorem. But first, we state a fact from game theory that will be used in its proof.

4.1 A minimax theorem 

We will require the following minimax theorem from game theory, which is a consequence of the Kakutani fixed point theorem in real analysis.

Fact 7 Let $A_1, A_2$ be non-empty, convex and compact subsets of $\mathbb{R}^n$ for some $n$. Let $u : A_1 \times A_2 \to \mathbb{R}$ be a continuous function, such that

• $\forall a_2 \in A_2$, the set $\{a_1 \in A_1 : \forall a_1' \in A_1\ u(a_1, a_2) \ge u(a_1', a_2)\}$ is convex; and

• $\forall a_1 \in A_1$, the set $\{a_2 \in A_2 : \forall a_2' \in A_2\ u(a_1, a_2) \le u(a_1, a_2')\}$ is convex.

Then, there is an $(a_1^*, a_2^*) \in A_1 \times A_2$ such that

$$\max_{a_1 \in A_1} \min_{a_2 \in A_2} u(a_1, a_2) = u(a_1^*, a_2^*) = \min_{a_2 \in A_2} \max_{a_1 \in A_1} u(a_1, a_2).$$

Remark: The above statement follows by combining Proposition 20.3 (which shows the existence of Nash equilibrium $a^*$ in strategic games) and Proposition 22.2 (which connects Nash equilibrium and the min-max theorem for games defined using a pay-off function such as $u$) of Osborne and Rubinstein's book on game theory [OR94, pages 19-22].
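As an illustration of the minimax fact, the sketch below evaluates both sides on a grid for a simple bilinear pay-off, the mixed extension of matching pennies, where $p$ and $q$ are the probabilities with which the two players play their first action. The choice of game and the grid resolution are assumptions made purely for illustration.

```python
# Pay-off to the maximising player in matching pennies with mixed strategies
# p, q in [0, 1]; it is bilinear, so the best-response sets are convex.
def u(p, q):
    return (2 * p - 1) * (2 * q - 1)

grid = [k / 100 for k in range(101)]  # contains the equilibrium point 0.5

# max over a_1 of min over a_2, and min over a_2 of max over a_1
max_min = max(min(u(p, q) for q in grid) for p in grid)
min_max = min(max(u(p, q) for p in grid) for q in grid)
```

Both quantities coincide with the value of the game, which is attained at $p = q = 1/2$; a grid search of this kind only approximates the continuous optimisation in general.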

4.2 Proof of the substate theorem 

We now state the quantum substate theorem as it is actually used in our privacy lower bound proofs. 

Theorem 2 (Quantum substate theorem) Consider two Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, $\dim(\mathcal{K}) \ge \dim(\mathcal{H})$. Let $\mathbb{C}^2$ denote the two dimensional complex Hilbert space. Let $\rho, \sigma$ be density matrices in $\mathcal{H}$. Let $r > 1$ be any real number. Let $k := S(\rho\|\sigma)$. Let $|\psi\rangle$ be a purification of $\rho$ in $\mathcal{H} \otimes \mathcal{K}$. Then there exist pure states $|\phi\rangle, |\theta\rangle \in \mathcal{H} \otimes \mathcal{K}$ and $|\zeta\rangle \in \mathcal{H} \otimes \mathcal{K} \otimes \mathbb{C}^2$, depending on $r$, such that $|\zeta\rangle$ is a purification of $\sigma$ and $\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \|_{\mathrm{tr}} \le 2/\sqrt{r}$, where

$$|\zeta\rangle := \sqrt{\frac{r - 1}{r 2^{r k'}}}\; |\phi\rangle|1\rangle + \sqrt{1 - \frac{r - 1}{r 2^{r k'}}}\; |\theta\rangle|0\rangle \quad \text{and} \quad k' := k + 4\sqrt{k + 2} + 2\log(k + 2) + 5.$$

Remarks:

1. Note that Result 2 in the introduction follows from above by tracing out $\mathcal{K} \otimes \mathbb{C}^2$.

2. From Result 2, one can easily see that $\|\rho - \sigma\|_{\mathrm{tr}} \le 2 - 2^{-O(k)}$. This implies a $2^{-O(k)}$ lower bound on the fidelity of $\rho$ and $\sigma$.

Overview of the proof of Theorem 2: As we have mentioned earlier, our proof of the quantum substate theorem proceeds by first defining a new notion of distinguishability called observational divergence, $D(\rho\|\sigma)$, between two density matrices $\rho, \sigma$ in the same Hilbert space $\mathcal{H}$. Informally speaking, this notion is a single observational version of relative entropy. Truly speaking, the substate theorem is a relationship between observational divergence and the substate condition. We first prove an observational divergence lifting theorem which shows that given two states $\rho, \sigma$ in $\mathcal{H}$ and any extension $\sigma'$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$, $\dim(\mathcal{K}) \ge \dim(\mathcal{H})$, one can find a purification $|\phi\rangle$ of $\rho$ in $\mathcal{H} \otimes \mathcal{K}$ such that $D(|\phi\rangle\langle\phi| \,\|\, \sigma') = O(D(\rho\|\sigma))$. This theorem may be of independent interest. This helps us reduce the statement we intend to prove only to the case when $\rho$ is a pure state. This case is then further reduced to analysing only a two dimensional scenario which is then resolved by a direct calculation. The final statement of the quantum substate theorem in terms of relative entropy is established by showing that observational divergence is never much bigger than relative entropy for any pair of states.

Let us begin by defining observational divergence.

Definition 10 (Observational divergence) Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Their observational divergence is defined as

$$D(\rho\|\sigma) := \sup_F\; \mathrm{Tr}(F\rho) \log \frac{\mathrm{Tr}(F\rho)}{\mathrm{Tr}(F\sigma)},$$

where $F$ above ranges over POVM elements on $\mathcal{H}$ such that $\mathrm{Tr}(F\sigma) \ne 0$.



The following properties of observational divergence follow easily from the definition. 



Proposition 1 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Then

1. $D(\rho\|\sigma) \ge 0$, with equality iff $\rho = \sigma$.

2. $D(\rho\|\sigma) < +\infty$ iff $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$. If $D(\rho\|\sigma) < +\infty$, then there is a POVM element $F$ which achieves equality in Definition 10.

3. $D(\cdot\|\cdot)$ is continuous in its two arguments when it is not infinite.

4. (Unitary invariance) If $U$ is a unitary transformation on $\mathcal{H}$, $D(U\rho U^\dagger \| U\sigma U^\dagger) = D(\rho\|\sigma)$.

5. (Monotonicity) Suppose $\mathcal{K}$ is a Hilbert space, and $\rho', \sigma'$ are extensions of $\rho, \sigma$ in $\mathcal{H} \otimes \mathcal{K}$. Then, $D(\rho'\|\sigma') \ge D(\rho\|\sigma)$. This implies, via unitary invariance and the Kraus representation theorem, that if $\mathcal{T}$ is a completely positive trace preserving superoperator from $\mathcal{H}$ to a Hilbert space $\mathcal{L}$, then $D(\mathcal{T}\rho \| \mathcal{T}\sigma) \le D(\rho\|\sigma)$.

Fact 4 and Proposition 1 seem to suggest that relative entropy and observational divergence are similar quantities. In fact, the relative entropy is an upper bound on the observational divergence to within an additive constant. More properties of observational divergence as well as comparisons with relative entropy are discussed in the appendix.

Proposition 2 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Then, $D(\rho\|\sigma) \le S(\rho\|\sigma) + 1$.

Proof: By Fact 4 and Proposition 1, $D(\rho\|\sigma) = +\infty$ iff $\mathrm{supp}(\rho) \not\subseteq \mathrm{supp}(\sigma)$ iff $S(\rho\|\sigma) = +\infty$. Thus, we can henceforth assume without loss of generality that $D(\rho\|\sigma) < +\infty$. By Proposition 1, there is a POVM element $F$ such that $D(\rho\|\sigma) = p \log(p/q)$, where $p := \mathrm{Tr}(F\rho)$ and $q := \mathrm{Tr}(F\sigma)$. We now have

$$S(\rho\|\sigma) \ge p \log\frac{p}{q} + (1 - p) \log\frac{1 - p}{1 - q} \ge p \log\frac{p}{q} + (1 - p) \log\frac{1}{1 - q} - 1 \ge p \log\frac{p}{q} - 1 = D(\rho\|\sigma) - 1.$$

The first inequality follows from the Lindblad-Uhlmann monotonicity of relative entropy (Fact 4), and the second inequality follows because $(1 - p) \log(1 - p) \ge (-\log e)/e > -1$, for $0 \le p \le 1$. This completes the proof of the proposition. ■
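The relationship $D(\rho\|\sigma) \le S(\rho\|\sigma) + 1$ can be checked numerically in the classical (commuting) case, where the two states are probability vectors. The sketch below takes the observational divergence to be the supremum of $\mathrm{Tr}(F\rho)\log(\mathrm{Tr}(F\rho)/\mathrm{Tr}(F\sigma))$ over POVM elements $F$, and assumes that for diagonal states this supremum is attained on indicator events (which follows from the joint convexity of $a \log(a/b)$). Logarithms are taken to base 2, and the example distributions are made up.

```python
import math
from itertools import combinations

def relative_entropy(p, q):
    """S(p||q) in bits; assumes supp(p) is contained in supp(q)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def observational_divergence(p, q):
    """Classical D(p||q): max over events E with q(E) > 0 of p(E) * log2(p(E)/q(E))."""
    n = len(p)
    best = 0.0  # the full event has p(E) = q(E) = 1 and contributes 0
    for size in range(1, n + 1):
        for event in combinations(range(n), size):
            pe = sum(p[i] for i in event)
            qe = sum(q[i] for i in event)
            if pe > 0 and qe > 0:
                best = max(best, pe * math.log2(pe / qe))
    return best

# Illustrative (assumed) distributions with full support.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
D = observational_divergence(p, q)
S = relative_entropy(p, q)
```

In this example `D` turns out to be larger than `S` while staying below `S + 1`, which suggests that some additive slack in Proposition 2 is genuinely needed.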
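We now prove the following lemma, which can be thought of as a substate theorem when the first density matrix is in fact a pure state.

Lemma 1 Let $|\psi\rangle$ be a pure state and $\sigma$ be a density matrix in the same Hilbert space $\mathcal{H}$. Let $k := D(|\psi\rangle\langle\psi| \,\|\, \sigma)$. Then for all $r \ge 1$, there exists a pure state $|\phi\rangle$, depending on $r$, such that

$$\left\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \right\|_{\mathrm{tr}} \le \frac{2}{\sqrt{r}} \quad \text{and} \quad \left(\frac{r - 1}{r 2^{r k}}\right) |\phi\rangle\langle\phi| \le \sigma.$$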

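Proof: We assume without loss of generality that $0 < k < +\infty$. Consider $M := \sigma - |\psi\rangle\langle\psi|/2^{rk}$. Since $-|\psi\rangle\langle\psi|/2^{rk}$ has exactly one non-zero eigenvalue and this eigenvalue is negative, viz. $-1/2^{rk}$, and $\sigma$ is positive semidefinite, $M$ is a hermitian matrix with at most one negative eigenvalue. If $M \ge 0$ we take $|\phi\rangle$ to be $|\psi\rangle$. The lemma trivially holds in this case.

Otherwise, let $|w\rangle$ be the eigenvector corresponding to the unique negative eigenvalue $-\alpha$ of $M$. Thinking of $|w\rangle\langle w|$ as a POVM element, we get

$$0 > -\alpha = \mathrm{Tr}(M |w\rangle\langle w|) = \langle w|\sigma|w\rangle - \frac{|\langle\psi|w\rangle|^2}{2^{rk}} \;\Rightarrow\; \langle w|\sigma|w\rangle < \frac{|\langle\psi|w\rangle|^2}{2^{rk}}.$$

Hence

$$k \ge |\langle\psi|w\rangle|^2 \log\frac{|\langle\psi|w\rangle|^2}{\langle w|\sigma|w\rangle} > rk\, |\langle\psi|w\rangle|^2 \;\Rightarrow\; |\langle\psi|w\rangle|^2 < \frac{1}{r} \le 1.$$

In particular, this shows that $|\psi\rangle, |w\rangle$ are linearly independent.

Let $n := \dim(\mathcal{H})$. Let $\{|v\rangle, |w\rangle\}$ be an orthonormal basis for the two dimensional subspace of $\mathcal{H}$ spanned by $\{|\psi\rangle, |w\rangle\}$. Extend it to $\{|v_1\rangle, \ldots, |v_{n-2}\rangle, |v\rangle, |w\rangle\}$, an orthonormal basis for the entire space $\mathcal{H}$. In this basis we have the following matrix equation,

$$\begin{pmatrix} F & e & d \\ e^\dagger & a & b \\ d^\dagger & b^\dagger & c \end{pmatrix} - \begin{pmatrix} 0 & 0 & 0 \\ 0^\dagger & x & y \\ 0^\dagger & y^\dagger & z \end{pmatrix} = \begin{pmatrix} P & l \\ l^\dagger & -\alpha \end{pmatrix}, \qquad (4)$$

where the first, second and third matrices are $\sigma$, $|\psi\rangle\langle\psi|/2^{rk}$ and $M$ respectively. $F$ is an $(n-2) \times (n-2)$ matrix, $P$ is an $(n-1) \times (n-1)$ matrix, $d, e$ are $(n-2) \times 1$ matrices and $l$ is an $(n-1) \times 1$ matrix, $a, c, x, z, \alpha$ are non-negative real numbers and $b, y$ are complex numbers. The zeroes above denote all zero matrices of appropriate dimensions. The dagger denotes conjugate transpose.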

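Claim 2 We have the following properties.

1. $b, y \in \mathbb{C}$, $a, c, x, z, \alpha \in \mathbb{R}$.

2. $b = y \ne 0$, $1/(r 2^{rk}) > z = c + \alpha > c > 0$, $a > 0$, $\alpha > 0$, $0 < x < 1/2^{rk}$, $x + z = 1/2^{rk}$, $l = 0$ and $d = 0$.

3. $0 < \dfrac{|b|^2}{z} < \dfrac{|b|^2}{c} \le a$.

Proof: The first part of the claim has already been mentioned above. Since $|w\rangle$ is an eigenvector of $M$ corresponding to eigenvalue $-\alpha$, $l = 0$. By inspection, we have $b = y$, $z = c + \alpha$, $d = 0$. We have $x > 0$ since $|\psi\rangle, |w\rangle$ are linearly independent, and $z > c \ge 0$ since $\alpha > 0$. Now, $x + z = \mathrm{Tr}(|\psi\rangle\langle\psi|/2^{rk}) = 1/2^{rk}$ and so $x < 1/2^{rk}$. Also, $z = |\langle\psi|w\rangle|^2/2^{rk} < 1/(r 2^{rk})$. Since $\sigma \ge 0$, $F \ge 0$ and $\begin{pmatrix} a & b \\ b^\dagger & c \end{pmatrix} \ge 0$. Hence,

$$\det\begin{pmatrix} a & b \\ b^\dagger & c \end{pmatrix} = ac - |b|^2 \ge 0.$$

$|\psi\rangle\langle\psi|/2^{rk}$ has one dimensional support. Hence,

$$\det\begin{pmatrix} x & y \\ y^\dagger & z \end{pmatrix} = xz - |y|^2 = 0.$$

If $c = 0$ then $y = b = 0$, which implies that $xz = 0$, which is a contradiction. Hence, $c > 0$ and $b \ne 0$. Similarly, $a > 0$. This proves the second part of the claim. The third part now follows easily. ■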
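We can now write $\sigma = \sigma_1 + \sigma_2$, where

$$\sigma_1 := \begin{pmatrix} F & e & 0 \\ e^\dagger & a - \frac{|b|^2}{c} & 0 \\ 0^\dagger & 0 & 0 \end{pmatrix} \quad \text{and} \quad \sigma_2 := \begin{pmatrix} 0 & 0 & 0 \\ 0^\dagger & \frac{|b|^2}{c} & b \\ 0^\dagger & b^\dagger & c \end{pmatrix}.$$

Note that $|\xi\rangle := (0, \ldots, 0, 1, -b^\dagger/c)$ is an eigenvector of $\sigma_2$ corresponding to the eigenvalue 0. We have $\sigma_2 \ge 0$, and in fact, $\sigma_2$ has one dimensional support. We now claim that $\sigma_1 \ge 0$. For otherwise, since $F \ge 0$, there is a vector $|\theta\rangle$ of the form $(a_1, \ldots, a_{n-2}, 1, 0)$ such that $\langle\theta|\sigma_1|\theta\rangle < 0$. Now consider the vector $|\theta'\rangle := (a_1, \ldots, a_{n-2}, 1, -b^\dagger/c)$. We have,

$$\langle\theta'|\sigma|\theta'\rangle = \langle\theta'|\sigma_1|\theta'\rangle + \langle\theta'|\sigma_2|\theta'\rangle = \langle\theta|\sigma_1|\theta\rangle + \langle\xi|\sigma_2|\xi\rangle < 0,$$

contradicting $\sigma \ge 0$. This shows that $\sigma_1 \ge 0$, and hence, $\sigma \ge \sigma_2$.

We are now finally in a position to define the pure state $|\phi\rangle$. Note that $|\phi\rangle\langle\phi|$ is nothing but $\sigma_2$ normalised to have unit trace. That is,

$$|\phi\rangle\langle\phi| := \frac{\sigma_2}{\frac{|b|^2}{c} + c}.$$

Using Claim 2 we get,

$$\mathrm{Tr}\,\sigma_2 = \frac{|b|^2}{c} + c > \frac{|b|^2}{z} + c = x + z - \alpha > \frac{r - 1}{r 2^{rk}}.$$

Hence, $\left(\frac{r - 1}{r 2^{rk}}\right) |\phi\rangle\langle\phi| < \sigma_2 \le \sigma$. This shows the second assertion of the lemma.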

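To complete the proof of the lemma, we still need to show that $\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \|_{\mathrm{tr}}$ is small. Up to global phase factors, one can write $|\psi\rangle, |\phi\rangle$ as follows:

$$|\psi\rangle = \frac{\frac{b}{\sqrt{z}} |v\rangle + \sqrt{z}\, |w\rangle}{\sqrt{\frac{|b|^2}{z} + z}}, \qquad |\phi\rangle = \frac{\frac{b}{\sqrt{c}} |v\rangle + \sqrt{c}\, |w\rangle}{\sqrt{\frac{|b|^2}{c} + c}}.$$

We now lower bound $|\langle\phi|\psi\rangle|^2$ as follows, using Claim 2 (in particular $|b|^2 = |y|^2 = xz$ and $z > c$):

$$|\langle\phi|\psi\rangle|^2 = \frac{\left(\frac{|b|^2}{\sqrt{cz}} + \sqrt{cz}\right)^2}{\left(\frac{|b|^2}{c} + c\right)\left(\frac{|b|^2}{z} + z\right)} = \frac{(|b|^2 + cz)^2}{(|b|^2 + c^2)(|b|^2 + z^2)} > \frac{|b|^2 + cz}{|b|^2 + z^2} = \frac{x + c}{x + z} = 1 - \frac{\alpha}{x + z} > 1 - \frac{1}{r}.$$

This proves that $\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \|_{\mathrm{tr}} = 2\sqrt{1 - |\langle\phi|\psi\rangle|^2} < 2/\sqrt{r}$, establishing the first assertion of the lemma and completing its proof. ■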
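We next prove the following lemma, which can be thought of as an 'observational substate' lemma.

Lemma 2 Consider two Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, $\dim(\mathcal{K}) \ge \dim(\mathcal{H})$. Let $\rho, \sigma$ be density matrices in $\mathcal{H}$. Let $|\psi\rangle$ be a purification of $\rho$ in $\mathcal{H} \otimes \mathcal{K}$. Let $F$ be a POVM element on $\mathcal{H} \otimes \mathcal{K}$. Let $\beta > 1$. Then there exists a purification $|\phi\rangle$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ such that $q \ge p/2^{k'/p}$, where $p := \mathrm{Tr}(F |\psi\rangle\langle\psi|)$, $q := \mathrm{Tr}(F |\phi\rangle\langle\phi|)$ and $k' := \beta D(\rho\|\sigma) - 2\log(1 - \beta^{-1/2})$.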

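Proof: We assume without loss of generality that $0 < D(\rho\|\sigma) < +\infty$ and that $p > 0$. Let $n := \dim(\mathcal{H} \otimes \mathcal{K})$ and $\{|\alpha_i\rangle\}_{i=1}^{n}$ be the orthonormal eigenvectors of $F$ with corresponding eigenvalues $\{\lambda_i\}_{i=1}^{n}$. Note that $0 \le \lambda_i \le 1$ and $|\alpha_i\rangle \in \mathcal{H} \otimes \mathcal{K}$. We have,

$$p = \sum_{i=1}^{n} \lambda_i |\langle\alpha_i|\psi\rangle|^2 \quad \text{and} \quad q = \sum_{i=1}^{n} \lambda_i |\langle\alpha_i|\phi\rangle|^2.$$

Define,

$$|\theta'\rangle := \frac{\sum_{i=1}^{n} \lambda_i \langle\alpha_i|\psi\rangle\, |\alpha_i\rangle}{\sqrt{p}} \quad \text{and} \quad |\theta\rangle := \frac{|\theta'\rangle}{\| |\theta'\rangle \|}.$$

Note that $p = |\langle\psi|\theta\rangle|^2\, \| |\theta'\rangle \|^2$ and $0 < \| |\theta'\rangle \|^2 \le 1$. Using the Cauchy-Schwarz inequality, we see that

$$|\langle\phi|\theta\rangle|^2\, \| |\theta'\rangle \|^2 = \frac{\left| \sum_{i=1}^{n} \lambda_i \langle\alpha_i|\psi\rangle \langle\phi|\alpha_i\rangle \right|^2}{p} \le \sum_{i=1}^{n} \lambda_i |\langle\alpha_i|\phi\rangle|^2 = q.$$

Thus,

$$\frac{p}{2^{k'/p}} = \frac{|\langle\psi|\theta\rangle|^2\, \| |\theta'\rangle \|^2}{2^{k'/(|\langle\psi|\theta\rangle|^2 \| |\theta'\rangle \|^2)}} \le \frac{|\langle\psi|\theta\rangle|^2\, \| |\theta'\rangle \|^2}{2^{k'/|\langle\psi|\theta\rangle|^2}}.$$

Hence, it will suffice to show that there exists a purification $|\phi\rangle$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ such that

$$|\langle\phi|\theta\rangle|^2 \ge \frac{|\langle\psi|\theta\rangle|^2}{2^{k'/|\langle\psi|\theta\rangle|^2}}.$$

Define the density matrix $\tau$ in $\mathcal{H}$ as $\tau := \mathrm{Tr}_{\mathcal{K}}\, |\theta\rangle\langle\theta|$. By Facts 2 and 3, there is a purification $|\phi\rangle$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ and a POVM $\{F_1, \ldots, F_l\}$ in $\mathcal{H}$ such that,

$$|\langle\phi|\theta\rangle| = B(\tau, \sigma) = \sum_{i=1}^{l} \sqrt{c_i b_i},$$

where $c_i := \mathrm{Tr}(F_i \tau)$ and $b_i := \mathrm{Tr}(F_i \sigma)$. Let $a_i := \mathrm{Tr}(F_i \rho)$. We know from Facts 2 and 3 that

$$0 < \sqrt{p} \le |\langle\psi|\theta\rangle| \le B(\tau, \rho) \le \sum_{i=1}^{l} \sqrt{c_i a_i}.$$

Note that the $a_i$'s are non-negative real numbers summing up to 1, and so are the $b_i$'s and the $c_i$'s.

For $\beta > 1$, define the set $S_\beta := \{ i \in [l] : a_i > b_i \cdot 2^{\beta k / B(\tau, \rho)^2} \}$, where $k := D(\rho\|\sigma)$. Note that $\forall i \in S_\beta$, $b_i \ne 0$ as $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, $k$ being finite. Define the POVM element $G$ on $\mathcal{H}$ as $G := \sum_{i \in S_\beta} F_i$. Let $a := \mathrm{Tr}(G\rho)$ and $b := \mathrm{Tr}(G\sigma)$. Then $a = \sum_{i \in S_\beta} a_i$, $b = \sum_{i \in S_\beta} b_i > 0$ and $a > b \cdot 2^{\beta k / B(\tau, \rho)^2}$. We have that

$$D(\rho\|\sigma) = k \ge a \log\frac{a}{b} > \frac{a \beta k}{B(\tau, \rho)^2} \;\Rightarrow\; a < \frac{B(\tau, \rho)^2}{\beta}.$$

Now, by the Cauchy-Schwarz inequality and the other inequalities proved above, we get

$$B(\tau, \rho) \le \sum_{i=1}^{l} \sqrt{c_i a_i} = \sum_{i \in S_\beta} \sqrt{c_i a_i} + \sum_{i \notin S_\beta} \sqrt{c_i a_i} < \frac{B(\tau, \rho)}{\sqrt{\beta}} + 2^{\beta k / (2 B(\tau, \rho)^2)}\, B(\tau, \sigma).$$

This shows that

$$B(\tau, \rho)^2 < (1 - \beta^{-1/2})^{-2} \cdot 2^{\beta k / B(\tau, \rho)^2}\, B(\tau, \sigma)^2 \;\Rightarrow\; |\langle\psi|\theta\rangle|^2 < (1 - \beta^{-1/2})^{-2} \cdot 2^{\beta k / |\langle\psi|\theta\rangle|^2}\, |\langle\phi|\theta\rangle|^2.$$

Since $k' = \beta k - 2\log(1 - \beta^{-1/2})$, we get $|\langle\phi|\theta\rangle|^2 > |\langle\psi|\theta\rangle|^2 / 2^{k'/|\langle\psi|\theta\rangle|^2}$, completing the proof of the lemma. ■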

In the previous lemma, the purification $|\phi\rangle$ of $\sigma$ was a function of the POVM element $F$. We now prove a lemma which, for any fixed $0 \le p \le 1$, removes the dependence on $F$ satisfying $\mathrm{Tr}(F |\psi\rangle\langle\psi|) \ge p$, at the expense of having a, in general, mixed extension of $\sigma$ in the place of a pure extension, i.e. purification.

Lemma 3 Consider two Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, $\dim(\mathcal{K}) \ge \dim(\mathcal{H})$. Let $\rho, \sigma$ be density matrices in $\mathcal{H}$ and $|\psi\rangle$ be a purification of $\rho$ in $\mathcal{H} \otimes \mathcal{K}$. Let $0 \le p \le 1$ and $\beta > 1$. Then there exists an extension $\omega$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ such that for all POVM elements $F$ on $\mathcal{H} \otimes \mathcal{K}$ such that $\mathrm{Tr}(F |\psi\rangle\langle\psi|) \ge p$, $\mathrm{Tr}(F\omega) \ge p/2^{k'/p}$, where $k' := \beta D(\rho\|\sigma) - 2\log(1 - \beta^{-1/2})$.

Proof: We assume without loss of generality that $0 < D(\rho\|\sigma) < +\infty$ and that $p > 0$. Consider the set $A_1$ of all extensions of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ and the set $A_2$ of all POVM operators $F$ in $\mathcal{H} \otimes \mathcal{K}$ such that $\mathrm{Tr}(F |\psi\rangle\langle\psi|) \ge p$. Observe that $A_1, A_2$ are non-empty, compact, convex sets. The conditions of Fact 7 are trivially satisfied (note that we think of our matrices, which in general have complex entries, as vectors in a larger real vector space). By Lemma 2, for every $F \in A_2$, we have a purification $|\phi^F\rangle \in \mathcal{H} \otimes \mathcal{K}$ of $\sigma$ such that

$$\mathrm{Tr}(F |\phi^F\rangle\langle\phi^F|) \ge \frac{\mathrm{Tr}(F |\psi\rangle\langle\psi|)}{2^{k'/\mathrm{Tr}(F |\psi\rangle\langle\psi|)}} \ge \frac{p}{2^{k'/p}}.$$

Using Fact 7, we see that there exists an extension $\omega$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ such that $\mathrm{Tr}(F\omega) \ge p/2^{k'/p}$ for all $F \in A_2$. This completes the proof. ■
The previous lemma depends upon the parameter $p$. We now remove this restriction by performing a 'discrete integration' operation and obtain an observational divergence 'lifting' result, which may be of independent interest.

Lemma 4 (Observational divergence lifting) Consider two Hilbert spaces $\mathcal{H}, \mathcal{K}$, $\dim(\mathcal{K}) \ge \dim(\mathcal{H})$. Let $\rho, \sigma$ be density matrices in $\mathcal{H}$, and $|\psi\rangle$ be a purification of $\rho$ in $\mathcal{H} \otimes \mathcal{K}$. Then there exists an extension $\omega$ of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ such that $D(|\psi\rangle\langle\psi| \,\|\, \omega) < D(\rho\|\sigma) + 4\sqrt{D(\rho\|\sigma) + 1} + 2\log(D(\rho\|\sigma) + 1) + 4$.

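Proof: We assume without loss of generality that $0 < D(\rho\|\sigma) < +\infty$. Let $\beta > 1$ and $\gamma \ge 1$. Define the monotonically increasing function $f : [0, 1] \to [0, 1]$ as follows:

$$f(p) := \frac{p}{2^{k'/p}}, \quad \text{where } 0 \le p \le 1 \text{ and } k' := \beta D(\rho\|\sigma) - 2\log(1 - \beta^{-1/2}).$$

For a fixed positive integer $l$, define $\Gamma_\gamma(l) := \sum_{i=1}^{l} i^{\gamma - 1}$. It is easy to see by elementary calculus that $\gamma^{-1} \cdot l^{\gamma} \le \Gamma_\gamma(l) \le \gamma^{-1} \cdot (l + 1)^{\gamma}$. Define the density matrix $\omega_l$ in $\mathcal{H} \otimes \mathcal{K}$ as $\omega_l := (\Gamma_\gamma(l))^{-1} \sum_{i=1}^{l} i^{\gamma - 1} \omega(i/l)$, where for $0 \le p \le 1$, $\omega(p)$ is an extension of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$ such that $\mathrm{Tr}(F\omega(p)) \ge f(p)$ for all POVM elements $F$ on $\mathcal{H} \otimes \mathcal{K}$ satisfying $\mathrm{Tr}(F |\psi\rangle\langle\psi|) \ge p$. Such an $\omega(p)$ exists by Lemma 3. Then, $\mathrm{Tr}_{\mathcal{K}}\, \omega_l = \sigma$, i.e. $\omega_l$ is an extension of $\sigma$ in $\mathcal{H} \otimes \mathcal{K}$.

Suppose $F$ is a POVM element on $\mathcal{H} \otimes \mathcal{K}$. Let $j/l \le p := \mathrm{Tr}(F |\psi\rangle\langle\psi|) < (j + 1)/l$, where $0 \le j \le l$. We assume without loss of generality that $p > 0$. Then,

$$\mathrm{Tr}(F\omega_l) = \frac{1}{\Gamma_\gamma(l)} \sum_{i=1}^{l} i^{\gamma - 1}\, \mathrm{Tr}(F\omega(i/l)) \ge \frac{1}{\Gamma_\gamma(l)} \sum_{i=1}^{j} i^{\gamma - 1} f(i/l)$$

$$\ge \frac{\Gamma_\gamma(j)}{\Gamma_\gamma(l)}\, f\!\left( \frac{\sum_{i=1}^{j} i^{\gamma}}{l \cdot \Gamma_\gamma(j)} \right) = \frac{\Gamma_\gamma(j)}{\Gamma_\gamma(l)}\, f\!\left( \frac{\Gamma_{\gamma+1}(j)}{l \cdot \Gamma_\gamma(j)} \right) \ge \left( \frac{j}{l + 1} \right)^{\gamma} f\!\left( \frac{\gamma\, j^{\gamma + 1}}{l (\gamma + 1) \cdot (j + 1)^{\gamma}} \right).$$

The second inequality above follows from the convexity of $f(\cdot)$. By compactness, the set $\{\omega_l : l \in \mathbb{N}\}$ has limit points. Choose a limit point $\omega$. By standard continuity arguments, $\mathrm{Tr}_{\mathcal{K}}\, \omega = \sigma$ and

$$\mathrm{Tr}(F\omega) \ge \lim_{l \to \infty} \left( \frac{j}{l + 1} \right)^{\gamma} f\!\left( \frac{\gamma\, j^{\gamma + 1}}{l (\gamma + 1) \cdot (j + 1)^{\gamma}} \right) = p^{\gamma} \cdot f\!\left( \frac{\gamma p}{\gamma + 1} \right) = \frac{\gamma \cdot p^{\gamma + 1}}{(\gamma + 1) \cdot 2^{(\gamma + 1) k' / (\gamma p)}}.$$

Hence, $q := \mathrm{Tr}(F\omega) \ge \dfrac{\gamma \cdot p^{\gamma + 1}}{(\gamma + 1) \cdot 2^{(1 + \gamma^{-1}) k'/p}}$ and

$$p \log\frac{p}{q} \le p \log\left( \gamma^{-1} (\gamma + 1) \cdot p^{-\gamma} \cdot 2^{k' (\gamma + 1) \gamma^{-1} p^{-1}} \right) = p \log(1 + \gamma^{-1}) - \gamma p \log p + (1 + \gamma^{-1}) k' \le (1 + \gamma^{-1}) k' + \gamma + 1.$$

The second inequality follows because $-p \log p < 1$ for $0 \le p \le 1$, and $\log(1 + \gamma^{-1}) \le 1$ for all $\gamma \ge 1$. Substituting $k' = \beta D(\rho\|\sigma) - 2\log(1 - \beta^{-1/2})$ gives

$$D(|\psi\rangle\langle\psi| \,\|\, \omega) \le \beta (1 + \gamma^{-1}) D(\rho\|\sigma) - 2 (1 + \gamma^{-1}) \log(1 - \beta^{-1/2}) + \gamma + 1.$$

We set $\beta = (1 + (D(\rho\|\sigma) + 1)^{-1/2})^2$ and $\gamma = (D(\rho\|\sigma) + 1)^{1/2}$ to get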

11^) < (1 + mp\\a) + ly'/'f • (l + {D{p\\a) + 1)-V2) . D{p\\a) 

+ (1 + iDip\\a) + ir^) ■ logp(plk) + 1) + {Dip\\a) + I)'/' + 1 

< D{p\\a) + VZ?(p||a) + l + (1 + iD{p\\a) + 1)-V2) . log{D{p\\a) + 1) + 4 

< D{p\\a) + Ay/D{p\\a) + 1 + 2log{D{p\\a) + 1) + 4. 

This completes the proof of the lemma. ■ 
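The bound of Lemma 4 is stated in terms of observational divergence, while the substate theorem uses the quantity $k' = k + 4\sqrt{k + 2} + 2\log(k + 2) + 5$ with $k$ the relative entropy. The following sketch checks numerically that substituting the bound $D \le S + 1$ of Proposition 2 into the Lemma 4 expression yields at most $k'$, and evaluates the resulting scaling factor and trace distance bound. Logarithms are assumed to be to base 2, and the function names and sample values are hypothetical.

```python
import math

def lifted_divergence_bound(d):
    """Right-hand side of Lemma 4 as a function of d = D(rho||sigma)."""
    return d + 4 * math.sqrt(d + 1) + 2 * math.log2(d + 1) + 4

def k_prime(s):
    """k' from the substate theorem as a function of s = S(rho||sigma)."""
    return s + 4 * math.sqrt(s + 2) + 2 * math.log2(s + 2) + 5

def substate_scale(r, kp):
    """Factor (r - 1) / (r * 2^{r k'}) by which the nearby state is scaled down."""
    return (r - 1) / (r * 2 ** (r * kp))

def trace_distance_bound(r):
    """Upper bound 2 / sqrt(r) on the trace distance in the substate theorem."""
    return 2 / math.sqrt(r)

# With D <= S + 1, the Lemma 4 bound should never exceed k'(S).
samples = [0.0, 0.1, 1.0, 5.0, 50.0]
ok = all(lifted_divergence_bound(s + 1) <= k_prime(s) + 1e-12 for s in samples)
```

The two expressions coincide when $D = S + 1$, so `ok` only confirms that the constants were chosen to match; larger `r` trades a smaller trace distance for an exponentially smaller scaling factor.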
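Lemma 4 relates the observational divergence of a pair of density matrices to the observational divergence
of their extensions in an extended Hilbert space, where the extension of the first density matrix is a
pure state. Using this, we are now finally in a position to prove the quantum substate theorem.

Proof (Theorem 2): By Proposition 2 and Lemma 4, there exists a density matrix $\omega$ in $\mathcal{H} \otimes \mathcal{K}$ such that
$\mathrm{Tr}_{\mathcal{K}}\, \omega = \sigma$ and

\[
\begin{aligned}
D(|\psi\rangle\langle\psi| \,\|\, \omega) &< D(\rho\|\sigma) + 4\sqrt{D(\rho\|\sigma)+1} + 2\log(D(\rho\|\sigma)+1) + 4 \\
&< S(\rho\|\sigma) + 4\sqrt{S(\rho\|\sigma)+2} + 2\log(S(\rho\|\sigma)+2) + 5 = k'.
\end{aligned}
\]

By Lemma 1, there exists a pure state $|\phi\rangle$ such that

\[ \bigl\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \bigr\|_{\mathrm{tr}} \le \frac{2}{\sqrt{r}} \quad \text{and} \quad \left(\frac{r-1}{r2^{rk'}}\right) |\phi\rangle\langle\phi| \le \omega. \]

Let $\tau_1 := \mathrm{Tr}_{\mathcal{K}}\, |\phi\rangle\langle\phi|$. By above, $\left(\frac{r-1}{r2^{rk'}}\right)\tau_1 \le \sigma$. That is, there exists a density matrix $\tau_2$ in $\mathcal{H}$ such that

\[ \sigma = \left(\frac{r-1}{r2^{rk'}}\right)\tau_1 + \left(1 - \frac{r-1}{r2^{rk'}}\right)\tau_2. \]

Let $|\theta\rangle \in \mathcal{H} \otimes \mathcal{K}$ be a canonical purification of $\tau_2$. Then, $|\zeta\rangle$ defined in the statement of Theorem 2 is a
purification of $\sigma$ in $\mathcal{H} \otimes \mathcal{K} \otimes \mathbb{C}^2$. This completes the proof of Theorem 2. ■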
5 Conclusion and open problems



In this paper we have proved a theorem about relative entropy of quantum states which gives a novel inter- 
pretation to this information theoretic quantity. Using this theorem, we have shown a privacy trade-off for 
computing set membership in the two-party quantum communication model. 

The statements of the classical and quantum substate theorems have one important difference. For two
quantum states $\rho, \sigma$ with $S(\rho\|\sigma) = k$, the distance between $\rho$ and $\rho'$, where $\rho'/2^{O(k)} \le \sigma$, is less in the
classical case than in the quantum case. More formally, the dependence on $r$ in Theorem 2 is $O(1/\sqrt{r})$
whereas in the classical analogue, Result 2, the dependence is like $O(1/r)$. The better dependence in the
classical scenario enables us to prove a kind of converse to the classical substate theorem, which is outlined 
in the appendix. It will be interesting to see if the dependence in the quantum setting can be improved to 
match the classical case, enabling us to prove a similar quantum converse. 

Another open question is whether there is an alternate proof of the quantum substate theorem which does
not go through observational divergence lifting. Finally, it will also be interesting to find yet more
applications of the classical and quantum substate theorems.
A Relationships between three distinguishability measures

In this paper we have seen two measures of distinguishability between quantum states, viz. relative entropy
and observational divergence. The substate theorem gives a connection between observational divergence
and a third measure of distinguishability between quantum states, which we call the substate property. We
define three variants of the substate property below, and study the relationships between them and relative
entropy and observational divergence.

Definition 11 (Substate property) Let $\rho, \sigma$ be two quantum states in the same Hilbert space $\mathcal{H}$. They
are said to have the $k$-substate property if for all $r > 1$, there exists a quantum state $\rho(r)$ in $\mathcal{H}$ such
that $\|\rho - \rho(r)\|_{\mathrm{tr}} \le 2/r$ and $\left(\frac{r-1}{r2^{rk}}\right)\rho(r) \le \sigma$. They are said to have the weak $k$-substate property if
$\|\rho - \rho(r)\|_{\mathrm{tr}}$ is upper bounded by $2/\sqrt{r}$ instead of $2/r$. They are said to have the strong $k$-substate property
if $\rho/2^{k} \le \sigma$.
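
For probability distributions, the conditions of Definition 11 can be checked directly. The following minimal Python sketch is an illustration, not taken from the paper: it assumes the scaling factor $(r-1)/(r2^{rk})$ in the substate condition, uses the $\ell_1$ distance as the trace distance of diagonal states, takes logarithms to base 2, assumes $Q$ has full support, and evaluates the observational divergence only over events $A \subseteq [n]$. All function names are hypothetical. The candidate $P(r)$ is obtained by deleting the indices $i$ with $P(i)/2^{rk} > Q(i)$ and renormalising, which is the truncation used in the proof of Part 1 of Proposition 4 below.

```python
import math
from itertools import combinations


def obs_divergence(P, Q):
    """max over events A of P(A) * log2(P(A) / Q(A)); assumes Q has full support."""
    n = len(P)
    best = 0.0
    for size in range(1, n + 1):
        for A in combinations(range(n), size):
            p = sum(P[i] for i in A)
            q = sum(Q[i] for i in A)
            if p > 0:
                best = max(best, p * math.log2(p / q))
    return best


def truncate(P, Q, k, r):
    """Drop Bad = {i : P[i] / 2^(r k) > Q[i]} and renormalise what is left."""
    scale = 2.0 ** (-r * k)  # underflows to 0.0 for huge r*k instead of overflowing
    good = [P[i] * scale <= Q[i] for i in range(len(P))]
    mass = sum(p for p, g in zip(P, good) if g)
    return [p / mass if g else 0.0 for p, g in zip(P, good)]


def is_substate_witness(P, P_r, Q, k, r, rel_tol=1e-9):
    """Check ||P - P_r||_1 <= 2/r and ((r-1) / (r 2^(r k))) * P_r <= Q entrywise."""
    dist = sum(abs(a - b) for a, b in zip(P, P_r))
    factor = (r - 1.0) / r * 2.0 ** (-r * k)
    dominated = all(factor * a <= b * (1.0 + rel_tol) for a, b in zip(P_r, Q))
    return dist <= (2.0 / r) * (1.0 + rel_tol) and dominated


if __name__ == "__main__":
    # Illustrative distributions, chosen arbitrarily.
    P = [0.5, 0.3, 0.2]
    Q = [0.1, 0.2, 0.7]
    k = obs_divergence(P, Q)
    for r in (1.5, 3.0, 10.0):
        print(r, is_substate_witness(P, truncate(P, Q, k, r), Q, k, r))
```

The weak variant would replace `2.0 / r` by `2.0 / math.sqrt(r)` in the distance check, and the strong variant reduces to the single entrywise test `P[i] / 2**k <= Q[i]`.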

The next proposition lists some easy consequences of the definition of substate property.
Proposition 3 Let $\rho, \sigma$ be density matrices in the same Hilbert space $\mathcal{H}$. Then

1. If $\rho, \sigma$ satisfy the $k$-substate property, then $k \ge 0$ with equality iff $\rho = \sigma$.

2. $\rho, \sigma$ satisfy the $k$-substate property with $k < +\infty$ iff $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$.

3. (Unitary invariance) If $U$ is a unitary transformation on $\mathcal{H}$, then $\rho, \sigma$ satisfy the $k$-substate property
iff $U\rho U^{\dagger}, U\sigma U^{\dagger}$ satisfy the $k$-substate property.

4. (Monotonicity) Suppose $\mathcal{K}$ is a Hilbert space, and $\rho', \sigma'$ are extensions of $\rho, \sigma$ in $\mathcal{H} \otimes \mathcal{K}$. If $\rho', \sigma'$
satisfy the $k$-substate property, then $\rho, \sigma$ satisfy it also. This implies, via unitary invariance and the
Kraus representation theorem, that if $\mathcal{T}$ is a completely positive trace preserving superoperator from
$\mathcal{H}$ to a Hilbert space $\mathcal{L}$, then if $\rho, \sigma$ satisfy the $k$-substate property, $\mathcal{T}\rho, \mathcal{T}\sigma$ do so also.

Similar statements hold for the weak and strong $k$-substate property also.

The following proposition states various relationships between our three measures of distinguishability 
that we have mentioned earlier. 

Proposition 4 We have: 

1. (Classical substate theorem) Two probability distributions $P, Q$ on $[n]$ with $D(P\|Q) = k$ satisfy the
$k$-substate property.

2. (Quantum substate theorem) Two quantum states $\rho, \sigma$ in $\mathbb{C}^n$ with $D(\rho\|\sigma) = k$ satisfy the weak $k'$-substate
property with $k' = k + 4\sqrt{k+1} + 2\log(k+1) + 4$.

3. If quantum states $\rho, \sigma$ in $\mathbb{C}^n$ have the $k$-substate property, then $D(\rho\|\sigma) \le 2k + 2$.

4. If quantum states $\rho, \sigma$ in $\mathbb{C}^n$ have the strong $k$-substate property, then $S(\rho\|\sigma) \le k$.

5. For any probability distributions $P, Q$ on $[n]$, $D(P\|Q) - 1 < S(P\|Q) \le D(P\|Q)(n-1)$.

6. For any quantum states $\rho, \sigma$ in $\mathbb{C}^n$, $D(\rho\|\sigma) - 1 < S(\rho\|\sigma) \le D(\rho\|\sigma)(n-1) + \log n$.

7. There exist probability distributions $P, Q$ on $[n]$ such that $S(P\|Q) > \left(\frac{D(P\|Q)}{2} - 1\right)(n-2) - 1$.

8. For any two quantum states $\rho, \sigma$ in $\mathbb{C}^n$, there exists a two-outcome POVM $\mathcal{F}$ on $\mathbb{C}^n$ such that
$S(\rho\|\sigma) \ge S(\mathcal{F}\rho\|\mathcal{F}\sigma) \ge \frac{S(\rho\|\sigma) - \log n}{n-1} - 1$.
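
Part 5 of the proposition, $D(P\|Q) - 1 < S(P\|Q) \le D(P\|Q)(n-1)$, lends itself to a quick numerical experiment on small alphabets. The sketch below is illustrative only and rests on several assumptions: logarithms are to base 2, the distributions have full support, and the observational divergence is evaluated by brute force over events $A \subseteq [n]$. The function names and the random sampling scheme are hypothetical choices, not anything prescribed by the paper.

```python
import math
import random
from itertools import combinations


def relative_entropy(P, Q):
    """S(P||Q) = sum_i p_i * log2(p_i / q_i); assumes Q has full support."""
    return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)


def obs_divergence(P, Q):
    """max over events A of P(A) * log2(P(A) / Q(A)), by brute force (small n only)."""
    n = len(P)
    best = 0.0
    for size in range(1, n + 1):
        for A in combinations(range(n), size):
            p = sum(P[i] for i in A)
            q = sum(Q[i] for i in A)
            best = max(best, p * math.log2(p / q))
    return best


def random_distribution(n, rng):
    """A full-support distribution on n points (weights bounded away from zero)."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def part5_holds(P, Q, tol=1e-9):
    """Check D - 1 < S <= D * (n - 1), up to a small floating-point tolerance."""
    n = len(P)
    D = obs_divergence(P, Q)
    S = relative_entropy(P, Q)
    return D - 1.0 < S + tol and S <= D * (n - 1) + tol


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (2, 3, 4, 5):
        ok = all(part5_holds(random_distribution(n, rng), random_distribution(n, rng))
                 for _ in range(100))
        print(n, ok)
```

The brute-force maximisation costs $2^n$ evaluations, so this is only meant for very small $n$.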

Remarks:

1. From Parts 1 and 3 of Proposition 4, we see that the classical substate theorem (Result 2) has a converse.

2. Unfortunately, we are unable to prove a converse to the quantum substate theorem, as Part 2 of
Proposition 4 only guarantees a weak substate property between the two quantum states $\rho, \sigma$.

3. Part 8 of Proposition 4 is a counterpart to monotonicity of relative entropy (Fact 4).

Proof (Proposition 4):

1. Without loss of generality, $k > 0$. Let $r > 1$. Define the set $\mathrm{Bad} := \{i \in [n] : P(i)/2^{rk} > Q(i)\}$.
Then,

\[ k = D(P\|Q) \ge P(\mathrm{Bad}) \log \frac{P(\mathrm{Bad})}{Q(\mathrm{Bad})} > P(\mathrm{Bad}) \cdot rk \;\Rightarrow\; P(\mathrm{Bad}) < \frac{1}{r}, \]

which is the same as the corresponding expression in Section 1.2. We can now argue similarly as in the proof of
Result 2 to prove Part 1 of the present proposition.

2. Follows from Lemmas 4 and 1.
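3. Without loss of generality, $0 < k_1 := D(\rho\|\sigma) < +\infty$. Let $F$ be a POVM element in $\mathbb{C}^n$ such that

\[ k_1 = p \log(p/q) \;\Rightarrow\; q = \frac{p}{2^{k_1/p}}, \]

where $p := \mathrm{Tr}(F\rho)$ and $q := \mathrm{Tr}(F\sigma)$. Note that $p > 0$. Let $r := 2/p$. Since $\rho, \sigma$ have the $k$-substate
property, let $\rho'$ be the quantum state in $\mathbb{C}^n$ such that $\|\rho - \rho'\|_{\mathrm{tr}} \le 2/r = p$ and $\left(\frac{r-1}{r2^{rk}}\right)\rho' \le \sigma$. Define
$p' := \mathrm{Tr}(F\rho')$. Then, $p' \ge p/2$. Also,

\[ \frac{p}{2^{k_1/p}} = q \ge \left(\frac{r-1}{r2^{rk}}\right) p' = \left(1 - \frac{p}{2}\right) \frac{p'}{2^{rk}} \ge \frac{p}{2^{rk+2}}. \]

The last inequality above follows because $p \le 1$ and $p' \ge p/2$. This implies that

\[ rk + 2 \ge \frac{k_1}{p} \;\Rightarrow\; p(rk+2) \ge k_1 \;\Rightarrow\; 2k + 2 \ge k_1, \]

where the second implication follows because $p \le 1$ and $p = 2/r$. This completes the proof of Part 3
of the present proposition.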

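4. Without loss of generality, $k < +\infty$. We have

\[ S(\rho\|\sigma) = \mathrm{Tr}\, \rho \log \rho - \mathrm{Tr}\, \rho \log \sigma \le \mathrm{Tr}\, \rho \log \rho - \mathrm{Tr}\, \rho \log \frac{\rho}{2^{k}} = k \cdot \mathrm{Tr}\, \rho = k. \]

The inequality above is by monotonicity of the logarithm function on positive operators [Low34].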

5. Without loss of generality, $0 < D(P\|Q) < +\infty$. The lower bound on $S(P\|Q)$ was proved in
Proposition 2. Define $x_i := \log(p_i/q_i)$. We can assume without loss of generality, by perturbing $Q$
slightly, that the values $x_i$ are distinct for distinct $i$. Let $S' := \{i \in [n] : x_i > 0\}$. Let $k := D(P\|Q)$.
For all positive $l$, define $S_l := \{i \in [n] : x_i \ge l\}$. Therefore,

\[ k \ge \Pr_P[S_l] \log \frac{\Pr_P[S_l]}{\Pr_Q[S_l]} \ge \Pr_P[S_l] \cdot l \;\Rightarrow\; \Pr_P[S_l] \le k/l. \]

Assume without loss of generality that $x_1 < x_2 < \cdots < x_n$. Then if $x_i > 0$, $\Pr_P[S_{x_i}] \le k/x_i$.
Since $S(P\|Q) \le \sum_{i \in S'} p_i x_i$, the upper bound on $S(P\|Q)$ is maximised when $S' = \{2, \ldots, n\}$,
$p_n = k/x_n$, $p_i = k(1/x_i - 1/x_{i+1})$ for all $i \in \{2, \ldots, n-1\}$, and $p_1 = 1 - \sum_{i=2}^{n} p_i$. Then,

\[
\begin{aligned}
S(P\|Q) &\le \sum_{i=2}^{n} p_i x_i = k \sum_{i=2}^{n-1} x_i (1/x_i - 1/x_{i+1}) + k = k \sum_{i=2}^{n-1} \frac{x_{i+1} - x_i}{x_{i+1}} + k < k \sum_{i=2}^{n-1} 1 + k \\
&= k(n-1).
\end{aligned}
\]

6. Without loss of generality, $0 < D(\rho\|\sigma) < +\infty$. The lower bound on $S(\rho\|\sigma)$ was proved in
Proposition 2. Let us measure $\rho$ and $\sigma$ in the eigenbasis of $\sigma$. We get two distributions, $P$ and $Q$. Below, we
will sometimes think of $P, Q$ as diagonal density matrices. From Part 5 of the present proposition, it
follows that

\[
\begin{aligned}
D(P\|Q)(n-1) &\ge S(P\|Q) = \mathrm{Tr}(P \log P) - \mathrm{Tr}(P \log Q) \ge -\log n - \mathrm{Tr}(P \log Q) \\
&= -\log n - \mathrm{Tr}(\rho \log \sigma) = -\log n + S(\rho\|\sigma) - \mathrm{Tr}(\rho \log \rho) \\
&\ge -\log n + S(\rho\|\sigma).
\end{aligned}
\]

The second equality above holds since the measurement was in the eigenbasis of $\sigma$.
Thus,

\[ S(\rho\|\sigma) \le D(P\|Q)(n-1) + \log n \le D(\rho\|\sigma)(n-1) + \log n, \]

where the second inequality is by monotonicity of observational divergence (Proposition 1).

7. Fix $a > 1$, $k > 0$. Define for all $i \in \{2, \ldots, n-1\}$, $p_i := a^{-i}(a-1)$, and $p_1 := a^{-1}(a-1)$,
$p_n := a^{-(n-1)}$. Define for all $i \in \{2, \ldots, n\}$, $q_i := p_i 2^{-ka^{i-1}}$, and $q_1 := 1 - \sum_{i=2}^{n} q_i$. Define
$P := (p_1, \ldots, p_n)$, $Q := (q_1, \ldots, q_n)$; $P, Q$ are probability distributions on $[n]$. For any $r > 1$,
consider $\tilde{P} := (p_1, \ldots, p_{\lfloor \log_a r \rfloor + 1}, 0, \ldots, 0)$ normalised to make it a probability distribution on $[n]$.
It is easy to see that $\|P - \tilde{P}\|_1 \le 2/r$ and $\left(\frac{r-1}{r2^{rk}}\right)\tilde{P} \le Q$. This shows that $P, Q$ satisfy the $k$-substate
property, hence $D(P\|Q) \le 2(k+1)$ by Part 3 of the present proposition.
Now,

\[
\begin{aligned}
S(P\|Q) &= \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i} > p_1 \log p_1 + \sum_{i=2}^{n} p_i \log \frac{p_i}{q_i} > -1 + (n-2)\frac{k(a-1)}{a} + k \\
&= k(n-1) - \frac{k(n-2)}{a} - 1.
\end{aligned}
\]

The second inequality above follows because $p \log p > -1$ for all $0 \le p \le 1$. By choosing $a$ large
enough, we can achieve $S(P\|Q) > k(n-2) - 1$. This completes the proof of Part 7 of the present
proposition.

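8. The upper bound on $S(\mathcal{F}\rho\|\mathcal{F}\sigma)$ follows from the monotonicity of relative entropy (Fact 4). Without
loss of generality, $0 < S(\rho\|\sigma) < +\infty$. We know that there exists a POVM element $F$ in $\mathbb{C}^n$ such
that $D(\rho\|\sigma) = p \log(p/q)$, where $p := \mathrm{Tr}\, F\rho$ and $q := \mathrm{Tr}\, F\sigma$. Define the two-outcome POVM $\mathcal{F}$
on $\mathbb{C}^n$ to be $(F, \mathbb{1} - F)$, where $\mathbb{1}$ is the identity operator on $\mathbb{C}^n$. Then, the probability distributions
$\mathcal{F}\rho = (p, 1-p)$ and $\mathcal{F}\sigma = (q, 1-q)$. Note that

\[ S(\mathcal{F}\rho\|\mathcal{F}\sigma) = p \log \frac{p}{q} + (1-p) \log \frac{1-p}{1-q} > p \log \frac{p}{q} - 1 = D(\rho\|\sigma) - 1, \]

where the inequality follows because $x \log x > -1$ for all $0 \le x \le 1$. From Part 6 of the present
proposition, it follows that

\[
\begin{aligned}
S(\rho\|\sigma) &\le D(\rho\|\sigma)(n-1) + \log n \le (S(\mathcal{F}\rho\|\mathcal{F}\sigma) + 1)(n-1) + \log n \\
\Rightarrow\; S(\mathcal{F}\rho\|\mathcal{F}\sigma) &\ge \frac{S(\rho\|\sigma) - \log n}{n-1} - 1.
\end{aligned}
\]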
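
The explicit distributions in Part 7 of the proof above can be instantiated numerically. The following Python sketch is an illustration under stated assumptions, not the authors' code: it reads the construction as $p_i = a^{-i}(a-1)$ for $1 \le i \le n-1$, $p_n = a^{-(n-1)}$, $q_i = p_i 2^{-ka^{i-1}}$ for $i \ge 2$, truncates $P$ after index $\lfloor \log_a r \rfloor + 1$, uses the scaling factor $(r-1)/(r2^{rk})$, and takes logarithms to base 2. The function names and the parameter values in the example are hypothetical.

```python
import math


def build_distributions(n, a, k):
    """Return (P, Q) as 0-indexed lists; entry i - 1 holds p_i (resp. q_i)."""
    p = [a ** (-i) * (a - 1) for i in range(1, n)] + [a ** (-(n - 1))]
    q = [0.0] * n
    for i in range(2, n + 1):
        q[i - 1] = p[i - 1] * 2.0 ** (-k * a ** (i - 1))
    q[0] = 1.0 - sum(q[1:])
    return p, q


def relative_entropy(P, Q):
    """S(P||Q) = sum_i p_i * log2(p_i / q_i)."""
    return sum(x * math.log2(x / y) for x, y in zip(P, Q) if x > 0)


def truncated(P, a, r):
    """Keep the first floor(log_a r) + 1 entries of P and renormalise."""
    n = len(P)
    m = min(n, math.floor(math.log(r, a)) + 1)
    mass = sum(P[:m])
    return [x / mass for x in P[:m]] + [0.0] * (n - m)


def substate_witness_ok(P, Q, a, k, r, rel_tol=1e-9):
    """Check ||P - P~||_1 <= 2/r and ((r-1) / (r 2^(r k))) * P~ <= Q entrywise."""
    P_t = truncated(P, a, r)
    dist = sum(abs(x - y) for x, y in zip(P, P_t))
    factor = (r - 1.0) / r * 2.0 ** (-r * k)
    dominated = all(factor * x <= y * (1.0 + rel_tol) for x, y in zip(P_t, Q))
    return dist <= (2.0 / r) * (1.0 + rel_tol) and dominated


if __name__ == "__main__":
    n, a, k = 5, 4.0, 1.0  # a > n - 2, so S(P||Q) should exceed k(n-2) - 1
    P, Q = build_distributions(n, a, k)
    print(relative_entropy(P, Q) > k * (n - 2) - 1)
    print([substate_witness_ok(P, Q, a, k, r) for r in (1.5, 3.0, 10.0, 50.0, 500.0)])
```

The values of $r$ in the example avoid exact powers of $a$, where floating-point rounding in `math.log(r, a)` could shift the truncation index by one. Larger $n$ or $a$ quickly push $2^{-ka^{i-1}}$ below double precision, so the sketch is only meant for small parameters.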